Advice for classifying symbols/images
I am working on a project that requires classification of characters and symbols (basically OCR that needs to handle single ASCII characters and symbols such as music notation). I am working with vector graphics (Paths and Glyphs in WPF), so the images can be of any resolution and rotation will be negligible. It will need to classify (and probably learn from) fonts and paths not in a training set. Performance is important, though high accuracy takes priority.
I have looked at some examples of image detection using Emgu CV (a .Net wrapper of OpenCV). However examples and tutorials I find seem to deal specifically with image detection and not classification. I don't need to find instances of an image within a larger image, just determine the kind of symbol in an image.
There seems to be a wide range of methods to choose from which might work and I'm not sure where to start. Any advice or useful links would be greatly appreciated.
You should probably look at the paper "Gradient-Based Learning Applied to Document Recognition", although that refers to handwritten letters and digits. You should also read about Shape Context by Belongie and Malik. The keywords you should be looking for are digit/character/shape recognition (not detection, not classification).
If you are using EmguCV, the SURF features example (StopSign detector) would be a good place to start. Another (possibly complementary) approach would be to use the MatchTemplate(..) method.
However examples and tutorials I find seem to deal specifically with image detection and not classification. I don't need to find instances of an image within a larger image, just determine the kind of symbol in an image.
By finding instances of a symbol in an image, you are in effect classifying it. I'm not sure why you think that is not what you need.
    // Slide the template over the source; each output pixel is a match score.
    Image<Gray, float> imgMatch = imgSource.MatchTemplate(imgTemplate, Emgu.CV.CvEnum.TM_TYPE.CV_TM_CCOEFF_NORMED);
    double[] min, max;
    Point[] pointMin, pointMax;
    imgMatch.MinMax(out min, out max, out pointMin, out pointMax);
    // max[0] is the best score, pointMax[0] the top-left corner of that match
    if (max[0] >= (double) myThreshold)
    {
        // Outline the matched region on the source image
        Rectangle rect = new Rectangle(pointMax[0], new Size(imgTemplate.Width, imgTemplate.Height));
        imgSource.Draw(rect, new Bgr(Color.Aquamarine), 1);
    }
That max[0] gives the score of the best match.
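To turn that score into a classifier, one option is to compare the symbol against one template per class and keep the label with the highest score. The following is only a sketch under stated assumptions: it does not call Emgu CV at all, it assumes the symbol and every template have already been rendered at the same size, and it computes a zero-mean normalized cross-correlation by hand (the same kind of score as CV_TM_CCOEFF_NORMED at a single position). The names TemplateClassifier, Score and Classify are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Emgu CV.
public static class TemplateClassifier
{
    // Zero-mean normalized cross-correlation of two equal-sized grayscale images.
    // The result lies in [-1, 1]; 1 means identical up to brightness and contrast.
    public static double Score(double[,] image, double[,] template)
    {
        int rows = image.GetLength(0), cols = image.GetLength(1);
        if (rows != template.GetLength(0) || cols != template.GetLength(1))
            throw new ArgumentException("Image and template must have the same size.");

        double meanI = 0, meanT = 0;
        foreach (double v in image) meanI += v;
        foreach (double v in template) meanT += v;
        meanI /= image.Length;
        meanT /= template.Length;

        double cross = 0, varI = 0, varT = 0;
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                double di = image[r, c] - meanI;
                double dt = template[r, c] - meanT;
                cross += di * dt;
                varI += di * di;
                varT += dt * dt;
            }
        }

        double denom = Math.Sqrt(varI * varT);
        // A completely flat image carries no shape information, so score it as 0.
        return denom == 0 ? 0 : cross / denom;
    }

    // Returns the label of the best-scoring template,
    // or null if no template reaches the threshold.
    public static string Classify(double[,] image,
                                  IDictionary<string, double[,]> templates,
                                  double threshold)
    {
        string bestLabel = null;
        double bestScore = threshold;
        foreach (var pair in templates)
        {
            double score = Score(image, pair.Value);
            if (score >= bestScore)
            {
                bestScore = score;
                bestLabel = pair.Key;
            }
        }
        return bestLabel;
    }
}
```

With fonts that are not in the training set, a single template per class is unlikely to be enough; keeping several templates per label (for example with keys like "A/serif" and "A/sans" mapped back to "A") is one way to extend this.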
Bring all your images down to some standard resolution (appropriately scaled and centered).
Break the canvas down into n square or rectangular blocks.
For each block, you can measure the number of black pixels or the ratio between black and white in that block and treat that as a feature.
Now that you can represent the image as a vector of features (each feature originating from a different block), you could use a lot of standard classification algorithms to predict what class the image belongs to.
Search for 'Viola-Jones' for more elaborate methods of this type.
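The block-counting steps above can be sketched in a few lines of C#. This is a minimal illustration, not a tuned implementation: it assumes the symbol has already been rasterized into a bool[,] where true means a black pixel, it uses the fraction of black pixels per block as the feature, and it stands in a plain 1-nearest-neighbour lookup for the "standard classification algorithm". The names ZoningClassifier, Features and Nearest are hypothetical, and the tuple syntax needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating block ("zoning") features.
public static class ZoningClassifier
{
    // Splits a binary image (true = black pixel) into grid x grid blocks and
    // returns the fraction of black pixels in each block, row by row.
    public static double[] Features(bool[,] image, int grid)
    {
        int rows = image.GetLength(0), cols = image.GetLength(1);
        var black = new double[grid * grid];
        var total = new double[grid * grid];

        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                // Integer arithmetic maps every pixel to exactly one block,
                // even when the image size is not a multiple of grid.
                int block = (r * grid / rows) * grid + (c * grid / cols);
                total[block]++;
                if (image[r, c]) black[block]++;
            }
        }

        for (int i = 0; i < black.Length; i++)
            black[i] = total[i] == 0 ? 0 : black[i] / total[i];
        return black;
    }

    // 1-nearest-neighbour: returns the label of the training vector with the
    // smallest squared Euclidean distance to the given features.
    public static string Nearest(double[] features,
                                 IEnumerable<(string Label, double[] Features)> training)
    {
        string best = null;
        double bestDist = double.PositiveInfinity;
        foreach (var sample in training)
        {
            double dist = 0;
            for (int i = 0; i < features.Length; i++)
            {
                double d = features[i] - sample.Features[i];
                dist += d * d;
            }
            if (dist < bestDist)
            {
                bestDist = dist;
                best = sample.Label;
            }
        }
        return best;
    }
}
```

The same feature vectors could be fed to any other classifier (an SVM or a small neural network, say); nearest-neighbour is used here only because it needs no training step.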