PDF files with XML files attached
Hi all,
I have a PDF file with an XML file attached, and I need to parse that XML file. Does anyone know how I can do that? I'm using C#.
Thanks in advance.
I believe this blog post describing how to read from a PDF file using C# is what you want.
This is the example he gives of grabbing text from the PDF:
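using System;
using org.pdfbox.pdmodel;
using org.pdfbox.util;

namespace PDFReader
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the PDF from disk (PDFBox, used from .NET).
            PDDocument doc = PDDocument.load("lopreacamasa.pdf");
            try
            {
                // Extract the text of every page and print it.
                PDFTextStripper pdfStripper = new PDFTextStripper();
                Console.Write(pdfStripper.getText(doc));
            }
            finally
            {
                // Release the underlying file handle.
                doc.close();
            }
        }
    }
}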
Here is what looks like an exhaustive and highly organized list of how to read PDFs with C#.
If what you need is some form of embedded metadata, as Mark suggested, I'm sure it's also possible to fetch it using the tools I've linked to.
Try using LINQ to XML as suggested in this question.
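Once you have the attachment's bytes out of the PDF (how you get them depends on the PDF library you pick, so that step is left out here), the LINQ to XML part could look roughly like the sketch below. The AttachmentXml class, its Parse helper and the sample invoice XML are made up for illustration; only XDocument and XElement come from System.Xml.Linq.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

namespace PDFReader
{
    public static class AttachmentXml
    {
        // Hypothetical helper: parses XML bytes that were already
        // extracted from the PDF attachment by some PDF library.
        public static XDocument Parse(byte[] xmlBytes)
        {
            using (var stream = new MemoryStream(xmlBytes))
            {
                return XDocument.Load(stream);
            }
        }

        public static void Main()
        {
            // Stand-in for the real attachment content (assumed sample data).
            byte[] bytes = Encoding.UTF8.GetBytes(
                "<invoice><item id=\"1\">Paper</item><item id=\"2\">Ink</item></invoice>");

            XDocument xml = Parse(bytes);

            // Walk the <item> children of the root element.
            foreach (XElement item in xml.Root.Elements("item"))
            {
                Console.WriteLine("{0}: {1}", (string)item.Attribute("id"), item.Value);
            }
        }
    }
}
```

With the sample data above this prints "1: Paper" and "2: Ink". Loading from a stream rather than a string lets XDocument honor whatever encoding the XML declaration in the attachment specifies.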
PDF files can have a metadata information object. Is that what you mean, or is the XML file embedded in the PDF as an object?