Detect Areas of Text in Screenshot
I'm working on a project to increase the ability for wine to automatically test software packages. What I'm looking to do now is detect text in the screen capture of the current window. I can then parse all of the text and use autohotkey to give a mouse click on the coordinates of the text I want.
For example, in firefox, I might want to test different things, the first open being opening preferences. I would then need to parse the screenshot of firefox, detect all of the separate locations of text. I can then run thes开发者_如何学JAVAe separate images of text into tesseract-ocr and detect which one, says "Edit". I then redo this again for "preferences".
I've tried to find a solution but so far can't find anything. I'd prefer a solution that uses python or has python binds as thats what I've been programing in so far.
A possible starting point is Project SIKULI. It is a tool to automate GUI testing. It is written in Java, nonetheless it includes a scripting environment based on Jython, hence modifying it to support python script may be not too difficult.
精彩评论