
Parse pictures of text to text in VB.Net

I am wondering whether there are any DLLs or features in VB.Net 2008 that I could use to parse a picture of text (for example, a screenshot) into text, assuming the text is in a very recognizable format (i.e., not CAPTCHA-style text).


If it is a perfectly readable, pure, unaltered screenshot, then the easiest (but probably slowest) way is to draw each candidate letter onto a bitmap (using Graphics.DrawString) and compare that bitmap, pixel by pixel, against the matching region of the screenshot. Even so, this could be reasonably quick considering how slow OCR can be, and it should give close to 100% accuracy as long as the font, size, and text rendering settings match those used in the screenshot. It gets better if you only need to recognize text in a certain area, since reducing the search area increases speed several times over. It gets better still if the text is in a fixed-width font and you know the font size or can figure it out by searching a small area: once a letter is recognized, you can skip its entire character cell!
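A minimal sketch of the draw-and-compare idea in VB.Net, offered as an illustration rather than a tested implementation. The module name GlyphMatchSketch and the helpers RenderGlyph and MatchesAt are hypothetical names, and the choice of TextRenderingHint is an assumption: exact pixel comparison only works if the font, size, colors, and anti-aliasing mode match however the screenshot text was originally drawn.

```vbnet
Imports System.Drawing
Imports System.Drawing.Text

Module GlyphMatchSketch
    ' Draw one character as black text on a white background.
    ' cellSize is the assumed size of one character cell in pixels.
    Function RenderGlyph(ByVal ch As Char, ByVal textFont As Font, ByVal cellSize As Size) As Bitmap
        Dim bmp As New Bitmap(cellSize.Width, cellSize.Height)
        Using g As Graphics = Graphics.FromImage(bmp)
            g.Clear(Color.White)
            ' Assumption: the screenshot text was drawn without anti-aliasing.
            g.TextRenderingHint = TextRenderingHint.SingleBitPerPixelGridFit
            g.DrawString(ch.ToString(), textFont, Brushes.Black, 0, 0)
        End Using
        Return bmp
    End Function

    ' Return True only if every pixel of the glyph bitmap equals the
    ' screenshot pixel at the same position relative to (offsetX, offsetY).
    Function MatchesAt(ByVal shot As Bitmap, ByVal glyph As Bitmap, _
                       ByVal offsetX As Integer, ByVal offsetY As Integer) As Boolean
        If offsetX < 0 OrElse offsetY < 0 Then Return False
        If offsetX + glyph.Width > shot.Width Then Return False
        If offsetY + glyph.Height > shot.Height Then Return False

        For y As Integer = 0 To glyph.Height - 1
            For x As Integer = 0 To glyph.Width - 1
                If shot.GetPixel(offsetX + x, offsetY + y).ToArgb() <> glyph.GetPixel(x, y).ToArgb() Then
                    Return False ' first mismatch rules this letter out
                End If
            Next
        Next
        Return True
    End Function
End Module
```

To recognize a character cell, you would call RenderGlyph once per candidate letter, keep the resulting bitmaps, and try MatchesAt for each of them at the cell's position. With fixed-width text, a successful match lets you advance by one cell width instead of one pixel.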

If you don't know how to do this type of image manipulation, that's OK. Look up GetPixel and SetPixel on MSDN to start out, then, once you need more speed, look for examples using LockBits.
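As a rough sketch of the LockBits route, the following copies a whole bitmap into an Integer array in one call, so later comparisons read from memory instead of calling GetPixel for every pixel. The module name LockBitsSketch and the helper ReadPixels are hypothetical, and the code assumes the locked data has a positive stride, which is the usual case for bitmaps created in code.

```vbnet
Imports System.Drawing
Imports System.Drawing.Imaging
Imports System.Runtime.InteropServices

Module LockBitsSketch
    ' Copy every pixel into an array of 32-bit ARGB values, row by row.
    ' The pixel at (x, y) ends up at index y * bmp.Width + x.
    Function ReadPixels(ByVal bmp As Bitmap) As Integer()
        Dim area As New Rectangle(0, 0, bmp.Width, bmp.Height)
        ' Asking for 32bpp ARGB means each pixel is exactly one Integer,
        ' so rows are contiguous (assuming a positive stride).
        Dim data As BitmapData = bmp.LockBits(area, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb)
        Try
            Dim pixels(bmp.Width * bmp.Height - 1) As Integer
            Marshal.Copy(data.Scan0, pixels, 0, pixels.Length)
            Return pixels
        Finally
            ' Always unlock, even if the copy throws.
            bmp.UnlockBits(data)
        End Try
    End Function
End Module
```

With both the screenshot and each letter bitmap converted this way, the pixel-by-pixel comparison becomes a plain comparison of array elements.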


By far and away your best bet on this one is to buy some OCR software to do it for you. Here's another option, although you'll have to wait: http://www.labnol.org/software/convert-scanned-pdf-images-to-text-with-google-ocr/5158/
