Pytesser inaccurate
Simple question. When I run this image through pytesser, I get $+s. How can I fix that?
EDIT
So... my code generates images similar to the image linked above, just with different numbers, and is supposed to solve the simple math problem, which is obviously impossible if all I can get out of the picture is $+s.
Here's the code I'm currently using:
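import time
from PIL import ImageGrab  # assumed import: needed for ImageGrab.grab, not shown in the original snippet
from pytesser import *

time.sleep(2)  # pause before the first capture
i = 0
operator = "+"
while i < 100:
    time.sleep(.1)
    # grab the 452x180 screen region whose top-left corner is at (349, 197)
    img = ImageGrab.grab((349, 197, 349 + 452, 197 + 180))
    equation = image_to_string(img)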
Then I'm going to go on to parse equation... as soon as I get pytesser working.
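For the parsing step mentioned above, here is a minimal sketch, assuming the OCR output is a single line of the form "number operator number" (as in 0+1). The names EQUATION_RE and solve are hypothetical, and the set of operators is a guess, since the question only shows "+". It is written for Python 3.

```python
import re

# Hypothetical pattern: two non-negative integers separated by one operator.
EQUATION_RE = re.compile(r"^\s*(\d+)\s*([+\-*])\s*(\d+)\s*$")

def solve(equation):
    """Return the value of a simple 'a op b' string, or None if it does not parse."""
    match = EQUATION_RE.match(equation)
    if match is None:
        # OCR noise such as '$+s' ends up here instead of raising an error.
        return None
    left = int(match.group(1))
    op = match.group(2)
    right = int(match.group(3))
    if op == "+":
        return left + right
    if op == "-":
        return left - right
    return left * right

print(solve("0+1"))
print(solve("$+s"))
```

Returning None for unparseable text lets the capture loop skip a bad read and try the next screenshot.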
Try my little function. I'm running tesseract from the svn repo, so my results might be more accurate. I'm on Linux, so on Windows, I'd imagine that you'll have to replace tesseract with tesseract.exe to make it work.
import tempfile, subprocess

def ocr(image):
    # tesseract appends '.txt' to the output base name it is given
    tempFile = tempfile.NamedTemporaryFile(delete = False)
    process = subprocess.Popen(['tesseract', image, tempFile.name], stdout = subprocess.PIPE, stdin = subprocess.PIPE, stderr = subprocess.STDOUT)
    process.communicate()  # wait for tesseract to finish
    handle = open(tempFile.name + '.txt', 'r').read()
    return handle
And a sample Python session:
>>> import tempfile, subprocess
>>> def ocr(image):
...     tempFile = tempfile.NamedTemporaryFile(delete = False)
...     process = subprocess.Popen(['tesseract', image, tempFile.name], stdout = subprocess.PIPE, stdin = subprocess.PIPE, stderr = subprocess.STDOUT)
...     process.communicate()
...     handle = open(tempFile.name + '.txt', 'r').read()
...     return handle
...
>>> print ocr('326_fail.jpg')
0+1
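Since the images only contain digits and an operator, one thing that might help on top of the function above is telling tesseract which characters to expect. The following is only a sketch for Python 3, assuming a reasonably recent tesseract on the PATH: the helper names build_tesseract_cmd and ocr_equation are made up, older releases spell the page segmentation flag -psm instead of --psm, and depending on the version and engine the whitelist may not be honored.

```python
import subprocess

def build_tesseract_cmd(image_path, whitelist="0123456789+-*/"):
    """Build a tesseract command line that prints the recognized text to stdout."""
    return [
        "tesseract", image_path, "stdout",
        "--psm", "7",  # treat the image as a single line of text
        "-c", "tessedit_char_whitelist=" + whitelist,  # restrict the allowed characters
    ]

def ocr_equation(image_path):
    """Run tesseract on image_path and return the recognized text, stripped."""
    result = subprocess.run(
        build_tesseract_cmd(image_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout.strip()

print(build_tesseract_cmd("326_fail.jpg"))
```

Writing to stdout avoids the temporary file, at the cost of depending on a tesseract build that supports stdout as the output base.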
If you're on Linux, try gocr, which may be more accurate. You can call it with
os.system("/usr/bin/gocr %s" % sample_image)
Note that os.system only returns the exit status. To read gocr's output line by line (for example, to store the result in a specific variable), run it through subprocess or os.popen and call readlines on its stdout.
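To get the recognized text into a variable, the command's stdout has to be captured rather than just printed. A rough sketch for Python 3, assuming gocr is on the PATH and accepts the input file via -i; the function names run_and_read_lines and gocr_lines are hypothetical:

```python
import subprocess
import sys

def run_and_read_lines(cmd):
    """Run a command and return its standard output as a list of lines."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.splitlines()

def gocr_lines(image_path):
    """Assumption: 'gocr -i FILE' prints the recognized text to stdout."""
    return run_and_read_lines(["gocr", "-i", image_path])

# Stand-in command so the helper can be tried without gocr installed.
print(run_and_read_lines([sys.executable, "-c", "print('0+1')"]))
```

With gocr installed, gocr_lines(sample_image) would return one string per line of recognized text.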