Validating Adobe PDF files programmatically, ideally in C#
I have a lot of PDFs that are copied from one server to another. Due to connection issues, a few get corrupted without any error or warning. However, Acrobat Reader says the file is not readable when I open it. Is there an API I can use to test whether a file is a valid PDF that can be opened, ideally in C#?
I wonder whether ExpertPDF (ExpertPDF HtmlToPdf Converter), which provides APIs, could help.
Thanks!
If you want to see whether a PDF is valid, I would take a look at iTextSharp. You can try opening the file with a PdfReader (any overload except the one that takes a RandomAccessFileOrArray, which I don't think parses the entire file immediately):
PdfReader r = new PdfReader("c:\\File.pdf");
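To use that check in a batch job, the constructor call can be wrapped in a try/catch. This is only a minimal sketch, assuming iTextSharp 5.x is referenced in the project; the names PdfCheck and IsReadablePdf are made up for illustration. It catches Exception broadly on the assumption that a damaged file can fail in different ways depending on where the damage is.

```csharp
using System;
using iTextSharp.text.pdf;

public static class PdfCheck
{
    // Hypothetical helper: returns true if iTextSharp can parse the file at 'path'.
    public static bool IsReadablePdf(string path)
    {
        PdfReader reader = null;
        try
        {
            // Use the string overload, which (per the note above) should
            // parse the file right away rather than lazily.
            reader = new PdfReader(path);
            return reader.NumberOfPages > 0;
        }
        catch (Exception)
        {
            // Assumption: any failure while opening or parsing means the file is unusable.
            return false;
        }
        finally
        {
            if (reader != null) reader.Close();
        }
    }
}
```

Usage would be something like `bool ok = PdfCheck.IsReadablePdf("c:\\File.pdf");`. As far as I know, the reader can sometimes recover from a damaged cross-reference table, so a file that passes here may still fail in Acrobat Reader; treat a pass as "probably fine" rather than a guarantee.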
But do you have control over this server-to-server transfer process? Files shouldn't be getting corrupted in the first place. Maybe you've got an FTP ASCII/BINARY problem? Is the file size changing? Can you compute a checksum before and after the transfer, even something simple like MD5? Fix the problem instead of cleaning up when it breaks.
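The before-and-after checksum idea could look roughly like this. It is a sketch using only the .NET base class library; TransferCheck, Md5Hex and SameContent are illustrative names, and it assumes both copies are reachable from the machine running the check (otherwise compute the digest on each server and compare the hex strings).

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class TransferCheck
{
    // Returns the MD5 digest of a file as a lowercase hex string.
    public static string Md5Hex(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            byte[] hash = md5.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // True when the source and destination copies have the same length and digest.
    public static bool SameContent(string sourcePath, string destPath)
    {
        // Cheap check first: a truncated transfer changes the file size.
        if (new FileInfo(sourcePath).Length != new FileInfo(destPath).Length)
            return false;
        return Md5Hex(sourcePath) == Md5Hex(destPath);
    }
}
```

MD5 is fine here because the goal is catching accidental corruption, not tampering. Any file where SameContent returns false can simply be copied again.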
You can set VERIFY to ON before running xcopy or robocopy to help ensure file integrity.
Otherwise, you can run a command-line utility such as PDFLEO to dump the metadata. If it reports an error, the file is likely damaged.