Merging PDFs programatically while maintaining the "Combine files..." bookmark structure?
I originally asked this on Adobe's forums but yet to receive any reponses.
I have to merge a set of many (100+) PDF files into a single report on a weekly basis, and so far, I have been doing the process by hand by selecting the files, right clicking, and selecting "Combine supported files in Acrobat". What I would like to do is replicate this exact same process programmatically (preferrably in Excel/VBA, but C# or Batch commands are acceptable alternatives). I currently 开发者_StackOverflow社区have code that will combine pdf files, but it it does not keep the bookmark structure the same way that "Combine supported files in Acrobat" does.
In other words, say I have three files called "A.pdf", "B.pdf", and "C.pdf", and each file contains two bookmarks called "Bkmrk 1" and "Bkmrk 2". I want to programatically combine these three files into a single file that has 9 bookmarks that look like the structure below:
A
Bkmrk 1
Bkmrk 2
B
Bkmrk 1
Bkmrk 2
C
Bkmrk 1
Bkmrk 2
I at first tried automating the process via the Acrobat SDK, but from what I understand the Acrobat SDK does not allow programs to interact with the dialog box that appears when you execute the "Combine Files" menu option, so that did not work. I also tried the option to programatically insert pages from one pdf file into another, but that does not produce the bookmark structure that I am looking for, nor does it let me manipulate the bookmark heirarchy to create the bookmark structure I am looking for.
Does anyone have an idea of how to do this? Any help would be greatly appreciated!
This was pure hell to get working, so I'm happy to share what I've got. This was adapted from code I found here, and will merge files, and put bookmarks at each merge point:
Private mlngBkmkCounter As Long
Public Sub updfConcatenate(pvarFromPaths As Variant, _
pstrToPath As String)
Dim origPdfDoc As Acrobat.CAcroPDDoc
Dim newPdfDoc As Acrobat.CAcroPDDoc
Dim lngNewPageCount As Long
Dim lngInsertPage As Long
Dim i As Long
Set origPdfDoc = CreateObject("AcroExch.PDDoc")
Set newPdfDoc = CreateObject("AcroExch.PDDoc")
mlngBkmkCounter = 0
'set the first file in the array as the "new"'
If newPdfDoc.Open(pvarFromPaths(LBound(pvarFromPaths))) = True Then
updfInsertBookmark "Test Start", lngInsertPage, , newPdfDoc
mlngBkmkCounter = 1
For i = LBound(pvarFromPaths) + 1 To UBound(pvarFromPaths)
Application.StatusBar = "Merging " & pvarFromPaths(i) & "..."
If origPdfDoc.Open(pvarFromPaths(i)) = True Then
lngInsertPage = newPdfDoc.GetNumPages
newPdfDoc.InsertPages lngInsertPage - 1, origPdfDoc, 0, origPdfDoc.GetNumPages, False
updfInsertBookmark "Test " & i, lngInsertPage, , newPdfDoc
origPdfDoc.Close
mlngBkmkCounter = mlngBkmkCounter + 1
End If
Next i
newPdfDoc.Save PDSaveFull, pstrToPath
End If
ExitHere:
Set origPdfDoc = Nothing
Set newPdfDoc = Nothing
Application.StatusBar = False
Exit Sub
End Sub
The insert-bookmark code... You would need to array your bookmarks from each document, and then set them
Public Sub updfInsertBookmark(pstrCaption As String, _
plngPage As Long, _
Optional pstrPath As String, _
Optional pMyPDDoc As Acrobat.CAcroPDDoc, _
Optional plngIndex As Long = -1, _
Optional plngParentIndex As Long = -1)
Dim MyPDDoc As Acrobat.CAcroPDDoc
Dim jso As Object
Dim BMR As Object
Dim arrParents As Variant
Dim bkmChildsParent As Object
Dim bleContinue As Boolean
Dim bleSave As Boolean
Dim lngIndex As Long
If pMyPDDoc Is Nothing Then
Set MyPDDoc = CreateObject("AcroExch.PDDoc")
bleContinue = MyPDDoc.Open(pstrPath)
bleSave = True
Else
Set MyPDDoc = pMyPDDoc
bleContinue = True
End If
If plngIndex > -1 Then
lngIndex = plngIndex
Else
lngIndex = mlngBkmkCounter
End If
If bleContinue = True Then
Set jso = MyPDDoc.GetJSObject
Set BMR = jso.bookmarkRoot
If plngParentIndex > -1 Then
arrParents = jso.bookmarkRoot.Children
Set bkmChildsParent = arrParents(plngParentIndex)
bkmChildsParent.createchild pstrCaption, "this.pageNum= " & plngPage, lngIndex
Else
BMR.createchild pstrCaption, "this.pageNum= " & plngPage, lngIndex
End If
MyPDDoc.SetPageMode 3 '3 — display using bookmarks'
If bleSave = True Then
MyPDDoc.Save PDSaveIncremental, pstrPath
MyPDDoc.Close
End If
End If
ExitHere:
Set jso = Nothing
Set BMR = Nothing
Set arrParents = Nothing
Set bkmChildsParent = Nothing
Set MyPDDoc = Nothing
End Sub
To use:
Public Sub uTest_pdfConcatenate()
Const cPath As String = "C:\MyPath\"
updfConcatenate Array(cPath & "Test1.pdf", _
cPath & "Test2.pdf", _
cPath & "Test3.pdf"), "C:\Temp\TestOut.pdf"
End Sub
You might need to consider a commercial tool such as Aspose.Pdf.Kit to get the level of flexibility you're after. It does support file concatenation and bookmark manipulation.
There's a 30 day unlimited trial so you can't really lose out other than time if it doesn't work for you.
Use iText# (http://www.itextpdf.com/). imho it is one of the best PDF-tools around. A code to do (approximately) what you want can be found here http://java-x.blogspot.com/2006/11/merge-pdf-files-with-itext.html Do not worry that all examples talk about Java, the classes and functions are the same in .NET
hth
Mario
Docotic.Pdf library can merge PDF files while maintaining outline (bookmarks) structure.
There is nothing special should be done. You just append all documents one after another and that's all.
using (PdfDocument pdf = new PdfDocument())
{
string[] filesToMerge = ...
foreach (string file in filesToMerge)
pdf.Append(file);
pdf.Save("merged.pdf");
}
Disclaimer: I work for Bit Miracle, vendor of the library.
The Acrobat SDK does let you create and read bookmarks. Check your SDK API Reference for:
PDDocGetBookmarkRoot()
PDBookmark* (AddChild, AddNewChild, GetNext, GetPrev... lots of functions in there)
If the "combine files" dialog doesn't give you the control you need, make your own dialog.
精彩评论