Parsing a faxed form
I'm looking at a scenario where a form (consisting of, for simplicity's sake, checkboxes only) is faxed to a fax server capable of OCR. With typographic text, I've seen various OCR implementations do a decent job, but I'm not sure how they would handle checkboxes, especially handwritten "x" marks or checks, not to mention the coordinates.
Back in grade school, we used to fill in those Gauss (sic) tests by shading in the correct answer with an HB pencil; somewhere, somehow, that was parsed and analyzed.
Where are we at today? Is there anything out-of-the-box?
You are referring to Optical Mark Recognition (OMR) technology, commonly used by Scantron and NCS in many US schools.
Most OCR servers have no real concept of reading OMR unless they are specifically designed to recognise different form types. It sounds like your OCR fax server software probably only does a full-page OCR and would have no concept of OMR fields.
You could possibly rig something up without investing too much effort or cost. If you design your questions as per the following guidelines, it could work quite well.
Which fruit do you prefer to eat ?
< > Apple
< > Pear
< > Orange
< > Banana
When the OCR engine comes back with the OCR text, you could assume that any character read between the < and > characters is an OMR mark, even if it is an unrecognised character.
Which fruit do you prefer to eat ?
< > Apple
< x > Pear
< ? > Orange
< > Banana
This would indicate that Pear and Orange were marked.
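The bracket heuristic above could be sketched in a few lines of Python. This is only an illustration, not part of any OCR product: it assumes the OCR engine returns plain text with one option per line and that the < and > characters themselves survive recognition. The name marked_options is hypothetical.

```python
import re

# Matches a line such as "< x > Pear": anything between < and >, then the label.
OPTION = re.compile(r"^\s*<(?P<mark>[^<>]*)>\s*(?P<label>.+?)\s*$")

def marked_options(ocr_text):
    """Return the labels whose brackets contain any non-whitespace character."""
    marked = []
    for line in ocr_text.splitlines():
        m = OPTION.match(line)
        # Any character at all between the brackets counts as a mark,
        # even one the OCR engine could not recognise (e.g. "?").
        if m and m.group("mark").strip():
            marked.append(m.group("label"))
    return marked

sample = """Which fruit do you prefer to eat ?
< > Apple
< x > Pear
< ? > Orange
< > Banana"""

print(marked_options(sample))  # ['Pear', 'Orange']
```

Treating any non-blank content as a mark keeps the check independent of what the OCR engine thinks the handwriting says, although stray fax noise inside a box would also count as a mark.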
TeleForm is a commercial package that could import the images and process the fax pages, but you would need to design the form in TeleForm first. http://www.cardiff.com/products/index.html