Extracting tags from PDF [closed]
Can someone recommend a library (Linux binary, jar, or source) to extract the tag tree from a tagged PDF file? I tried PDFMiner, but it crashed on the first file I tried.
Have you tried iText? Take a look at PDFVole for an example of a project that displays this tree visually using iText. You will not be able to link the tree nodes to their corresponding page content with this approach, though.
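To give an idea of what walking the tag tree involves, here is a minimal Python sketch. It is not iText or PDFVole code. It assumes the PDF objects have already been parsed into plain Python dicts by whatever library you end up using, and the names `walk_tags` and `sample_root` are made up for illustration. It relies only on the structure-tree layout from the PDF specification: the catalog's /StructTreeRoot entry, with each element carrying a structure type in /S and its children in /K, where a child is another element, an integer marked-content ID (MCID), or a marked-content reference dictionary.

```python
def walk_tags(node, depth=0):
    """Yield (depth, tag, mcids) for `node` and every structure element below it."""
    kids = node.get("/K", [])
    if not isinstance(kids, list):
        kids = [kids]  # /K may hold a single kid instead of an array

    # Integer kids are MCIDs; /MCR dictionaries wrap an MCID as well.
    mcids = []
    for kid in kids:
        if isinstance(kid, int):
            mcids.append(kid)
        elif isinstance(kid, dict) and kid.get("/Type") == "/MCR":
            mcids.append(kid["/MCID"])

    # The root dictionary has no /S entry, so fall back to a label for it.
    yield depth, node.get("/S", "/StructTreeRoot"), mcids

    for kid in kids:
        # Recurse only into real structure elements, not content or object references.
        if isinstance(kid, dict) and kid.get("/Type") not in ("/MCR", "/OBJR"):
            yield from walk_tags(kid, depth + 1)


# Hypothetical, hand-written stand-in for a parsed structure tree.
sample_root = {
    "/Type": "/StructTreeRoot",
    "/K": {
        "/S": "/Document",
        "/K": [
            {"/S": "/H1", "/K": 0},
            {"/S": "/P", "/K": [1, {"/Type": "/MCR", "/MCID": 2}]},
        ],
    },
}

if __name__ == "__main__":
    for depth, tag, mcids in walk_tags(sample_root):
        print("  " * depth + tag, mcids)
```

The sketch only collects the MCIDs. Mapping them back to the text on a page would mean parsing the page content streams for the matching marked-content sequences, which is the part the tree-only approach does not give you.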