Image to text conversion: how to crop single words to single files?
I need to do something similar to the question "How to write a bash script that cuts images into pieces using image magick?"
But I don't know in advance where the areas are or how big they are: I need to determine the "boxes" that contain each word, then crop each one and save it to its own file.
Most OCR software does something like this, so you could try looking at the source code of an OCR program. Many years ago, I spent a lot of time with the code for GOCR (http://jocr.sourceforge.net/), which has a pretty simple-minded implementation of this algorithm.
If you don't want to write code, I'm not sure what to suggest. But if you can find software that chops images into pieces based on whitespace, you could try blurring the image (to make the text into blobs) and then thresholding and finding boxes from that. It's not clear that the results would be very useful though.
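If you do end up writing code, the blur / threshold / find-boxes idea can be sketched roughly as follows. This is only a minimal illustration in plain Python, not how GOCR does it: the image is assumed to be a list of rows of grayscale values (0 = black ink, 255 = white paper), and the function names `box_blur`, `word_boxes` and `crop`, as well as the `radius` and `threshold` defaults, are made up for this example and would need tuning for real scans.

```python
from collections import deque


def box_blur(img, radius):
    """Mean-filter a grayscale image given as a list of rows.

    Pixels outside the image are treated as white (255).
    """
    h, w = len(img), len(img[0])
    size = (2 * radius + 1) ** 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    total += img[yy][xx] if inside else 255
            out[y][x] = total / size
    return out


def word_boxes(img, radius=1, threshold=200):
    """Blur, threshold, then return one bounding box per dark blob.

    Boxes are (left, top, right, bottom) with inclusive coordinates,
    sorted by their left edge first.
    """
    h, w = len(img), len(img[0])
    blurred = box_blur(img, radius)
    dark = [[value < threshold for value in row] for row in blurred]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not dark[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over 4-connected dark pixels.
            queue = deque([(y, x)])
            seen[y][x] = True
            left, top, right, bottom = x, y, x, y
            while queue:
                cy, cx = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and dark[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


def crop(img, box):
    """Cut the region described by an inclusive box out of the image."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]


if __name__ == "__main__":
    # Synthetic page: two "words", each made of two 3-pixel-wide "letters"
    # separated by a 1-pixel gap, with an 8-pixel gap between the words.
    page = [[255] * 30 for _ in range(9)]
    for y in range(3, 6):
        for x in (2, 3, 4, 6, 7, 8, 17, 18, 19, 21, 22, 23):
            page[y][x] = 0
    for box in word_boxes(page):
        piece = crop(page, box)
        print(box, "->", len(piece[0]), "x", len(piece), "pixels")
```

The blur radius decides what gets merged: it has to be large enough to bridge the gaps between letters but smaller than the gaps between words, so expect to experiment. The boxes come out slightly padded because the blur spreads the ink outwards. For real images you would load the pixels and write each crop to its own file with an imaging library of your choice; the nested loops here are far too slow for full-page scans.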