Channel: VBForums - CodeBank - Visual Basic 6 and earlier

[VB6] pdftotext.dll - VB6-compatible DLL for extracting text from PDFs

After getting frustrated relying on Adobe Acrobat to extract text from PDFs, I started hunting around for an alternative solution.

The first release of pdftotext.dll for VB6 is on GitHub. A binary download is available on the Releases page.

Usage
Code:

' Put this in a standard module (.bas): AddressOf only accepts procedures defined in a standard module.
Private Declare Function getNumPages Lib "pdftotext.dll" (ByVal lpFileName As String, Optional ByVal lpLogCallbackFunc As Long, Optional ByVal lpOwnerPassword As String, Optional ByVal lpUserPassword As String) As Integer
Private Declare Function extractText Lib "pdftotext.dll" (ByVal lpFileName As String, ByRef lpTextOutput As String, Optional ByVal iFirstPage As Integer, Optional ByVal iLastPage As Integer, Optional ByVal lpTextOutEnc As String, Optional ByVal lpLayout As String, Optional ByVal lpLogCallbackFunc As Long, Optional ByVal lpOwnerPassword As String, Optional ByVal lpUserPassword As String) As Integer

' Receives log messages from the DLL.
Public Sub LogCallback(ByVal str As String)
    Debug.Print "Log: " & str
End Sub

Public Sub ExtractExample()
    Dim strOutput As String
    Dim pages As Integer
    Dim ret As Integer
    pages = getNumPages("filename.pdf", AddressOf LogCallback, "pass", "anotherpass")
    ret = extractText("filename.pdf", strOutput, 1, 3, "UTF-8", "rawOrder", AddressOf LogCallback, "pass", "anotherpass")
End Sub

Almost all arguments are optional: only lpFileName is required, plus lpTextOutput for extractText. For example, the following works:
Code:

Dim strOutput As String
Dim pages As Integer, ret As Integer
pages = getNumPages("filename.pdf")
ret = extractText("filename.pdf", strOutput)
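If you want the text one page at a time, something like the following might work. Treat it as a rough sketch: it assumes getNumPages returns the page count, that a value below 1 means the file couldn't be read, and that page numbers start at 1 with iFirstPage/iLastPage being inclusive (which the 1, 3 in the usage example suggests). None of that is confirmed here, so check the README on GitHub. DumpPagesSketch is just a made-up name.

```vb
' Sketch only. Goes in the same standard module as the Declare statements,
' since those are declared Private.
Public Sub DumpPagesSketch(ByVal strFile As String)
    Dim pages As Integer
    Dim ret As Integer
    Dim i As Integer
    Dim strPage As String

    pages = getNumPages(strFile)
    ' Assumption: a value below 1 means the file could not be read.
    If pages < 1 Then
        Debug.Print "getNumPages returned " & pages
        Exit Sub
    End If

    For i = 1 To pages
        strPage = vbNullString
        ' Assumption: iFirstPage = iLastPage = i extracts just page i.
        ret = extractText(strFile, strPage, i, i)
        Debug.Print "Page " & i & " (ret=" & ret & "): " & Len(strPage) & " chars"
    Next i
End Sub
```

Call it from the Immediate window with DumpPagesSketch "filename.pdf". The return value of extractText is only printed, not interpreted, because its meaning isn't stated above.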

