Tesseract appears to learn characters as I run more OCR passes. How do I save the learned data between uses?

I have a set of 10 images to run OCR on. They contain only digits and are fairly short, about 20 digits per image. If I run one particular image first, it produces some mismatches; however, if I run the other images first and then come back to that one, every character matches.

I am inclined to conclude that Tesseract is learning the characters as more OCR operations are performed, which makes me very happy. Now the question is: is it possible to save the learned data so that Tesseract picks it up the next time I use it?


You can set classify_save_adapted_templates to 1 in your Tesseract config file to save the adapted templates, and set classify_use_pre_adapted_templates to 1 to load those templates the next time you run Tesseract.

The code that specifies the behavior of these options is here: http://code.google.com/p/tesseract-ocr/source/browse/trunk/classify/classify.cpp?r=570
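
As a rough illustration of the two settings mentioned above, here is a minimal Python sketch that writes them to a config file and builds a command line that passes the file to Tesseract. The file name adapt.cfg, the image names, and the helper functions are made up for this example. It also assumes your Tesseract build accepts a config file path as a trailing argument; if it does not, placing the file in tessdata/configs/ and passing just its name is the usual alternative.

```python
from pathlib import Path

# The two variable names come from the answer above; everything else
# (file names, helper names) is hypothetical.
ADAPTATION_SETTINGS = {
    "classify_save_adapted_templates": 1,     # save adapted templates after a run
    "classify_use_pre_adapted_templates": 1,  # load previously saved templates
}


def render_config(settings):
    """Render settings in Tesseract's config format: one 'name value' per line."""
    return "".join(f"{name} {value}\n" for name, value in settings.items())


def write_config(path, settings=ADAPTATION_SETTINGS):
    """Write the settings to a config file and return its path."""
    path = Path(path)
    path.write_text(render_config(settings), encoding="utf-8")
    return path


def build_command(image, output_base, config_path):
    """Build the argument list: tesseract <image> <outputbase> <configfile>."""
    return ["tesseract", str(image), str(output_base), str(config_path)]


if __name__ == "__main__":
    config = write_config("adapt.cfg")
    # Print the command instead of running it, so the sketch works
    # even where Tesseract is not installed.
    print(" ".join(build_command("digits01.tif", "digits01", config)))
```

To actually run it, you could pass the list returned by build_command to subprocess.run. Whether the saved templates help on your digit images is something you would need to check by comparing results with and without the config.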
