Optimal configuration for Tessnet -- is image format conversion good enough?
I need to do OCR on a group of images. I have been using Tessnet and it works pretty well. The problem is that it seems to have problems with some images, so I thought that it might work better if I modify the images' brightness, contrast, etc. Also, the images are in .jpg format, but I read that .tiff is optimal.
What can I do? Should I just convert the JPEGs to TIFFs?
There's no point in converting the JPEG images to a lossless format like TIFF: the compression artifacts are already baked into the pixels, so they get converted along with everything else. You could try applying a sharpening kernel to the image before running OCR on it.
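To make the sharpening idea concrete, here is a minimal sketch of a 3x3 sharpening convolution. It is written in plain Python only to show the arithmetic; the function name `sharpen`, the kernel weights, and the list-of-rows grayscale representation are all assumptions for illustration, not anything Tessnet provides. In a real pipeline you would more likely use your imaging library's convolution filter.

```python
# A common 3x3 sharpening kernel (an illustrative choice, not the only one):
# it boosts the centre pixel and subtracts its four direct neighbours.
# The weights sum to 1, so flat regions keep their brightness.
SHARPEN_KERNEL = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]


def sharpen(pixels):
    """Return a sharpened copy of a grayscale image.

    `pixels` is a list of rows, each a list of ints in 0..255
    (a hypothetical in-memory format chosen for this sketch).
    Borders are handled by replicating the nearest edge pixel.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    # Clamp coordinates so the kernel never reads outside the image.
                    sy = min(max(y + ky - 1, 0), height - 1)
                    sx = min(max(x + kx - 1, 0), width - 1)
                    total += SHARPEN_KERNEL[ky][kx] * pixels[sy][sx]
            # Clamp the result back into the valid 8-bit range.
            row.append(min(max(total, 0), 255))
        result.append(row)
    return result


if __name__ == "__main__":
    # A soft vertical edge: the step from 100 to 140 is smeared through 120.
    blurry = [[100, 100, 120, 140, 140] for _ in range(3)]
    for row in sharpen(blurry):
        print(row)
```

On a soft edge like the one in the example, the dark side gets darker and the bright side gets brighter, which is what makes character strokes stand out more. The same filter also amplifies noise and JPEG block artifacts, so it is worth comparing OCR results with and without it rather than assuming it helps.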
Look at this page for more information.