How to programmatically detect an XFA (Adobe XML Forms Architecture) dynamic PDF
I have a system that converts PDF to TIF. It's a program written in C# that uses iTextSharp to get metadata about the PDF and pdf2tif (http://pdftotif.sourceforge.net/) to convert the file. I've noticed that a number of PDFs do not convert correctly. In Acrobat and Foxit they open as multi-page forms, but in any other viewer (Ghostscript...) they open as one-page documents with the message:
"To view the full contents of this document, you need a later version of the PDF viewer. You can upgrade to the latest version of Adobe Reader from "www.adobe.com/products/acrobat/readstep2.html" For further support, go to http://www.adobe.com/support/products/acrreader.html"
Some Googling told me that these are XFA dynamic PDFs. Is there any way I can programmatically detect that, so I can handle these PDFs differently?
The iText API is a good start.
In iTextSharp you read the XfaForm object's XfaPresent property instead of calling a method, as you would in the Java version. (If you've done a moderate amount of work with iTextSharp, you probably already know this.)
Anyway, here's a simple example using an HTTP Handler:
<%@ WebHandler Language="C#" Class="iTextXfa" %>
using System;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;

public class iTextXfa : IHttpHandler {
    public void ProcessRequest (HttpContext context) {
        HttpServerUtility Server = context.Server;
        // One known non-XFA file and one known XFA file to compare.
        string[] testFiles = {
            Server.MapPath("./non-XFA.pdf"), Server.MapPath("./XFA.pdf")
        };
        foreach (string file in testFiles) {
            PdfReader reader = new PdfReader(file);
            try {
                // XfaPresent reports whether the PDF carries an XFA form.
                XfaForm xfa = new XfaForm(reader);
                context.Response.Write(string.Format(
                    "<p>File: {0} is XFA: {1}</p>",
                    file,
                    xfa.XfaPresent ? "YES" : "NO"
                ));
            }
            finally {
                reader.Close(); // release the reader's resources
            }
        }
    }
    public bool IsReusable { get { return false; } }
}
Command line approach:
strings document.pdf | grep XFA
If you get a line or two, you're probably working with an XFA PDF:
<</Names[(!ADBE::0100_VersChkStrings) 364 0 R(!ADBE::0100_VersChkVars) 365 0 R(!ADBE::0200_VersChkCode_XFACheck) 366 0 R]>>
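If you want to screen a batch of files before handing them to the converter, the same check can be wrapped in a small script. This is only a rough sketch: `is_xfa` is a made-up helper name, it assumes a grep that supports `-a` (GNU and BSD grep do), and it is a heuristic. A PDF that stores its objects in compressed object streams can hide the string, so a miss here does not prove the file is not XFA.

```shell
#!/bin/sh
# is_xfa FILE: succeed (exit 0) if the raw bytes of FILE contain the string "XFA".
# Hypothetical helper name; same heuristic as `strings document.pdf | grep XFA`.
is_xfa() {
    # -a treats the binary PDF as text, -q suppresses output and stops at the first match
    grep -a -q -- 'XFA' "$1"
}

# Report on every file passed on the command line.
for f in "$@"; do
    if is_xfa "$f"; then
        printf '%s: probably XFA\n' "$f"
    else
        printf '%s: no XFA marker found\n' "$f"
    fi
done
```

Saved as, say, check_xfa.sh (the name is arbitrary), it could be run as `sh check_xfa.sh *.pdf`, and anything flagged could be routed to a different conversion path.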