PDF Parser API in Java [closed]
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
I want to convert PDF data into our own file specification. Please help me choose the right API for PDF parsing in Java or .NET. The parser should extract each and every component (element) from the PDF pages.
There's a library called iText that does what you want. It's sort of the #1 product out there and is free as in beer.
I've worked with iText before, extracting content from PDFs, and while it's not super-duper automatic, it allows you to get at everything.
Recommended, in other words.
"Elements" as such do not exist in a PDF file. A PDF is a set of PDF objects that together generate the pages.
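To make the "set of objects" point concrete, here is a minimal sketch in plain Java (no PDF library) that lists the indirect object headers in a small piece of PDF text. The class name `PdfObjectScan`, the method `listObjectIds`, and the `SAMPLE` fragment are all made up for illustration, and the scan is deliberately naive: it assumes uncompressed objects whose headers start a line, so it is not a substitute for a real parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PdfObjectScan {
    // Hypothetical hand-written fragment of a PDF body (not a complete, valid file).
    static final String SAMPLE =
        "%PDF-1.4\n"
      + "1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
      + "2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
      + "3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n";

    // Matches "<object number> <generation number> obj" at the start of a line.
    private static final Pattern OBJ_HEADER =
        Pattern.compile("(?m)^(\\d+)\\s+(\\d+)\\s+obj\\b");

    // Returns "<object number> <generation number>" for every header found.
    // Naive: ignores object streams and any "N G obj" text inside stream data.
    static List<String> listObjectIds(String pdfText) {
        List<String> ids = new ArrayList<>();
        Matcher m = OBJ_HEADER.matcher(pdfText);
        while (m.find()) {
            ids.add(m.group(1) + " " + m.group(2));
        }
        return ids;
    }

    public static void main(String[] args) {
        for (String id : listObjectIds(SAMPLE)) {
            System.out.println("indirect object " + id);
        }
    }
}
```

The sample contains a catalog, a page tree node, and a page, so the scan finds three objects. The page is just one dictionary that points to other objects through references such as `2 0 R`, which is why a library has to interpret the object graph before anything resembling an "element" can be extracted.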
Try PDFBox: http://java-source.net/open-source/pdf-libraries/pdf-box
Hope it will help.