How to parse a lot of PDFs
I have a ton of PDFs that I want to parse sentence by sentence. Is there a tool for MySQL (or some other database system) that loads PDFs into the database and then lets me read the sentences out one at a time? Or is there some other tool for this? I imagined that loading all the PDFs into a DB and then reading from it would be the fastest way, but I don't really know...
Try pdftotext to convert each PDF to plain text, then insert that text into the DB.
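Something like this might be a starting point. It is only a rough sketch, assuming the pdftotext command-line tool is installed and on your PATH. I used Python's built-in sqlite3 instead of MySQL so it runs without a server, the sentence splitter is a naive regex (it will trip on abbreviations like "e.g."), and the table and function names are made up for illustration.

```python
import re
import sqlite3
import subprocess
from pathlib import Path


def pdf_to_text(pdf_path):
    """Run the pdftotext CLI; "-" as the output file sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    flat = " ".join(text.split())  # collapse line breaks left over from the PDF layout
    return [s for s in re.split(r"(?<=[.!?])\s+", flat) if s]


def store_sentences(conn, doc_name, sentences):
    """Insert one row per sentence, keeping its position within the document."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sentences "
        "(doc TEXT, position INTEGER, sentence TEXT)"
    )
    conn.executemany(
        "INSERT INTO sentences VALUES (?, ?, ?)",
        [(doc_name, i, s) for i, s in enumerate(sentences)],
    )
    conn.commit()


def load_directory(conn, pdf_dir):
    """Convert every *.pdf in pdf_dir and store its sentences."""
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        store_sentences(conn, pdf_path.name, split_sentences(pdf_to_text(pdf_path)))


def iter_sentences(conn):
    """Yield (doc, sentence) tuples one at a time, in document order."""
    yield from conn.execute(
        "SELECT doc, sentence FROM sentences ORDER BY doc, position"
    )
```

To use it, open a connection with sqlite3.connect("pdfs.db"), call load_directory(conn, "path/to/pdfs") once, then loop over iter_sentences(conn). Swapping in MySQL should mostly mean changing the connection and the placeholder style of whichever driver you pick.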