Extract text from X11 GUIs?
I have a point-of-sale application written in Perl/Tk. I use X11::GUITest to do automated testing of it, driving the app via hot-keys bound to the buttons and other widgets (it's normally touch-screen driven). However, X11::GUITest doesn't have a way to "read" text back from the screen, so I resort to augmenting the app to write temp files as well as putting data on the screen. The test scripts then look at the temp files, not the GUI. But I'd love to extend X11::GUITest or make a new CPAN module that can scrape text strings from X11 GUIs. I'm not after graphics-to-text conversion; it's my (faint) understanding that somewhere in the depths of the X window system, label text and such are stored as text strings and rendered to bitmap form late in the pipeline (?).
Anyone have guidance on how to do this, or pointers on where to start?
Yeah, I know I should've adhered to better MVC separation and tested just below the GUI level rather than at it; however, reality got in the way and it is what it is!
The best way to do this is to make your program work with an accessibility framework like ATK (used in GTK applications) and then use that to query for the strings as a screen reader would have to for text-to-speech translation. This is the approach taken by the Linux Desktop Testing Project and dogtail testing frameworks. You get the bonus of using existing, well tested code and making your application more usable by disabled users (as may be required by laws such as the Americans with Disabilities Act in the US and similar laws in many other countries).
If your application is using modern font frameworks, like libXft2, this may be your only choice, as those strings exist only in the client application, not the X server, and the character-to-pixmap conversion is done in the client. (If your text is antialiased, it must be using these instead of the legacy X11 APIs.)
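To give a feel for what querying strings "as a screen reader would" looks like, here is a minimal Python sketch. It assumes the application already exposes an accessibility tree and that the `pyatspi` bindings are installed; whether a given toolkit exposes such a tree is something to check first. The function names `collect_labels` and `find_application` and the default role list are made up for illustration. The walker only relies on the `name`, `childCount`, `getChildAtIndex()` and `getRoleName()` members of accessible objects.

```python
def collect_labels(node, wanted_roles=("label", "text", "push button")):
    """Walk an accessibility tree depth-first and return (role, name) pairs.

    `node` is expected to behave like an AT-SPI accessible object.
    The default role names are an assumption; adjust them to your widgets.
    """
    found = []
    role = node.getRoleName()
    if role in wanted_roles and node.name:
        found.append((role, node.name))
    for i in range(node.childCount):
        child = node.getChildAtIndex(i)
        if child is not None:  # children can vanish while walking
            found.extend(collect_labels(child, wanted_roles))
    return found


def find_application(app_name):
    """Return the accessible for the running app called `app_name`, or None."""
    import pyatspi  # imported lazily so collect_labels works without it

    desktop = pyatspi.Registry.getDesktop(0)
    for i in range(desktop.childCount):
        app = desktop.getChildAtIndex(i)
        if app is not None and app.name == app_name:
            return app
    return None
```

A test script would then call something like `collect_labels(find_application("pos-app"))`, where `"pos-app"` is a placeholder name, and compare the returned strings against what it expects to see on screen.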
Even with the legacy X11 APIs, though, the X server doesn't store the strings once the text-to-bitmap conversion is done, so in that case there's no good way to query them short of intercepting the drawing requests as the client sends them.
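As a rough illustration of what intercepting would involve, the Python sketch below pulls strings out of core-protocol text requests. It assumes you have already captured the raw client-to-server byte stream (for example through a proxy sitting between the app and the server), that the capture starts after the connection setup, that the client is little-endian, and that BIG-REQUESTS is not in use. It only handles the 8-bit requests `PolyText8` (opcode 74) and `ImageText8` (opcode 76); the function names are made up for this example.

```python
import struct

POLY_TEXT8 = 74   # core opcode, used by XDrawString / XDrawText
IMAGE_TEXT8 = 76  # core opcode, used by XDrawImageString
HEADER = 16       # opcode, n/unused, length, drawable, gc, x, y


def text_from_request(req):
    """Return the strings carried by one core X11 text-drawing request."""
    opcode = req[0]
    if opcode == IMAGE_TEXT8:
        n = req[1]  # string length lives in the second header byte
        return [req[HEADER:HEADER + n].decode("latin-1")]
    if opcode == POLY_TEXT8:
        out, pos = [], HEADER
        while pos + 2 <= len(req):
            m = req[pos]
            if m == 255:  # font-shift item: 255 followed by a 4-byte font id
                pos += 5
                continue
            chunk = req[pos + 2:pos + 2 + m]  # byte pos+1 is the x delta
            if m and len(chunk) == m:
                out.append(chunk.decode("latin-1"))
            pos += 2 + m
        return out
    return []  # any other request carries no text we look at


def scan_requests(stream, byteorder="<"):
    """Split a captured request stream and collect all drawn strings."""
    found, pos = [], 0
    while pos + 4 <= len(stream):
        (units,) = struct.unpack_from(byteorder + "H", stream, pos + 2)
        if units == 0:  # BIG-REQUESTS or corrupt data: give up
            break
        size = units * 4  # request length is counted in 4-byte units
        found.extend(text_from_request(stream[pos:pos + size]))
        pos += size
    return found
```

This only sees text drawn through the core requests, so it would miss anything rendered client-side, and a label drawn in several pieces comes back as several strings.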
The program listres lists resources in widgets, including label texts and, I think, the contents of text entry boxes. You may be able to use its output directly, extracting what you need, or you may need to look at the source and see how it's done.