开发者

Pytesser inaccurate

Simple question. When I run this image through pytesser, i get $+s. How can I fix that?

EDIT

So... my code generates images similar to the image linked above, just with different numbers, and is supposed to solve the simple math problem, which is obviously impossible if all I can get out of the picture is $+s

Here's the code I'm currently using:

from pytesser import *

time.sle开发者_如何学Pythonep(2)
i = 0
operator = "+"
while i < 100:
    time.sleep(.1);
    img = ImageGrab.grab((349, 197, 349 + 452, 197 + 180))
    equation = image_to_string(img)

Then I'm going to go on to parse equation... as soon as I get pytesser working.


Try my little function. I'm running tesseract from the svn repo, so my results might be more accurate.

I'm on Linux, so on Windows, I'd imagine that you'll have to replace tesseract with tesseract.exe to make it work.

import tempfile, subprocess

def ocr(image):
  tempFile = tempfile.NamedTemporaryFile(delete = False)

  process = subprocess.Popen(['tesseract', image, tempFile.name], stdout = subprocess.PIPE, stdin = subprocess.PIPE, stderr = subprocess.STDOUT)
  process.communicate()

  handle = open(tempFile.name + '.txt', 'r').read()

  return handle

And a sample Python session:

>>> import tempfile, subprocess
>>> def ocr(image):
...   tempFile = tempfile.NamedTemporaryFile(delete = False)
...   process = subprocess.Popen(['tesseract', image, tempFile.name], stdout = subprocess.PIPE, stdin = subprocess.PIPE, stderr = subprocess.STDOUT)
...   process.communicate()
...   handle = open(tempFile.name + '.txt', 'r').read()
...   return handle
... 
>>> print ocr('326_fail.jpg')
0+1


if you're in linux, use gocr is more accurate. you can use it through

os.system("/usr/bin/gocr %s") % (sample_image)

and use readlines from stdout for manipulating output result to everything what you want (i.e creating output from gocr for specific variable).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜