How to extract text from the PDF document? [closed]
How can I extract text from a PDF document using PHP?
(I can't use other tools, I don't have root access)
I've found some functions that work for plain text, but they don't handle Unicode characters well:
http://www.hashbangcode.com/blog/zend-lucene-and-pdf-documents-part-2-pdf-data-extraction-437.html
Download class.pdf2text.php from https://pastebin.com/dvwySU1a or https://webcheatsheet.com/php/scripts/pdf2text.zip
Code:
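// Load the downloaded class file.
include('class.pdf2text.php');
$a = new PDF2Text();
$a->setFilename('filename.pdf'); // path of the PDF to read
$a->decodePDF();                 // parse the file and decode its text
echo $a->output();               // print the extracted text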
class.pdf2text.php
Project Home: pdf2textclass
This class doesn't work with all the PDFs I've tested. If it doesn't work for you, try PDF Parser.
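For the PDF Parser fallback, here is a minimal sketch. It assumes that "PDF Parser" means the smalot/pdfparser package and that it was installed into the project directory with Composer (which runs as a normal user, so root access should not be needed). The helper name extractPdfText is hypothetical and only used for illustration.

```php
<?php
// Assumption: smalot/pdfparser was installed with
//   composer require smalot/pdfparser
// so that vendor/autoload.php exists next to this script.
if (is_file(__DIR__ . '/vendor/autoload.php')) {
    require_once __DIR__ . '/vendor/autoload.php';
}

/**
 * Hypothetical helper: returns the text of a PDF, or null if the
 * file cannot be read or parsed.
 */
function extractPdfText(string $path): ?string
{
    if (!is_readable($path)) {
        return null;
    }
    try {
        $parser = new \Smalot\PdfParser\Parser();
        $pdf = $parser->parseFile($path); // parse the whole document
        return $pdf->getText();           // text of all pages, joined
    } catch (\Throwable $e) {
        // Covers parser exceptions (e.g. malformed or encrypted files)
        // and a missing library (class not found).
        return null;
    }
}

$text = extractPdfText('filename.pdf');
echo $text !== null ? $text : 'Could not extract text.';
```

Returning null instead of letting the exception escape makes it easy to try one extractor and fall back to another when a particular PDF is not supported.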