Determining object identities using image recognition

Asked 2023-02-28 18:50

I've written some image analysis software that can determine the basic shape, color, and dimensions of what it considers to be the most dominant object in the image.

I've also created a database of objects for the algorithm to choose from:

Item       | Shape     | Colors              | Width range | Height range
Box        | rectangle | brown, black, white | 20-50 cm    | 10-30 cm
Basketball | circle    | orange              | 20-25 cm    | 20-25 cm
Backpack   | rectangle | black               | 40-50 cm    | 20-30 cm
... etc.

An example would be where the system detects a black rectangle that is 42 cm wide and 26 cm high. In this case, both 'box' and 'backpack' would qualify as correct answers. Are there any good ways to make an educated guess as to which of the two items it is, such as a 75% chance it's a backpack and a 25% chance it's a box (possibly based on the fact that boxes can be any of 3 different colors and span a wider range of sizes, whereas the backpack can only be black)?

Other advice is also welcome. I'm having to teach myself about image recognition, so if there are other things I should be trying to pull out of an image, or a different way that I should be going about the database, those comments would also be greatly appreciated!

Apologies for the rather high-level description without much of a justification of why it works, but you can easily fill books answering that question and it's 1pm already, so I have to make it short:

In addition to recording the range of acceptable sizes for boxes and backpacks, you need to define a probability distribution over those sizes. Most likely you'd just go with a (2D) normal distribution, in which case you'd record the mean and variance instead of the range. Do the same for the shape, color, etc. variables, using a suitable probability distribution for each.
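To make the idea concrete, here is a minimal sketch of how such distributions could turn the question's example into percentages. Everything numeric in it is an assumption, not something from the question: the mean is taken as the midpoint of each range, the standard deviation as a quarter of the range, width and height are treated as independent (a special case of the 2D normal), every listed color is equally likely, and all items are equally common a priori. The names CLASSES, likelihood and posterior are made up for illustration.

```python
import math

# Hypothetical per-item models built from the table in the question.
# Assumption: each listed color is equally likely for that item.
CLASSES = {
    "box":        {"shape": "rectangle", "colors": ["brown", "black", "white"],
                   "width": (20, 50), "height": (10, 30)},
    "basketball": {"shape": "circle", "colors": ["orange"],
                   "width": (20, 25), "height": (20, 25)},
    "backpack":   {"shape": "rectangle", "colors": ["black"],
                   "width": (40, 50), "height": (20, 30)},
}

def normal_pdf(x, lo, hi):
    # Assumption: mean = midpoint of the range, std dev = a quarter of the range.
    mean = (lo + hi) / 2.0
    std = (hi - lo) / 4.0
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def likelihood(model, shape, color, width, height):
    # Shape and color are categorical: zero if the item never has that value.
    if shape != model["shape"] or color not in model["colors"]:
        return 0.0
    p_color = 1.0 / len(model["colors"])
    return (p_color
            * normal_pdf(width, *model["width"])
            * normal_pdf(height, *model["height"]))

def posterior(shape, color, width, height):
    # Equal priors assumed, so the posterior is the normalized likelihood.
    scores = {name: likelihood(m, shape, color, width, height)
              for name, m in CLASSES.items()}
    total = sum(scores.values())
    if total == 0.0:
        return scores  # nothing in the database matches at all
    return {name: s / total for name, s in scores.items()}

result = posterior("rectangle", "black", 42, 26)
for name, p in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {p:.3f}")
```

Under these assumptions the backpack comes out well ahead of the box, for the reason the question suspects: the box spreads its probability over three colors and a wider size range. Different choices for the spread or the priors will shift the numbers.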

Then generate two data sets with a few hundred data points each, like this:

p_1 = (shape=rectangle, color=black, width=12, height=34)
p_2 = (shape=circle, color=red, width=34, height=11)
...

For one of the sets, manually classify them as the object that would match the description best. That will become your verification set.

Take the other data set and train a classification algorithm like Fisher's linear discriminant using that data. You obtain a transformation T that will maximize the "distance" between the classes (groups of data points representing an object) and minimize the "distance" between the points belonging to the same group.

When your program detects a new object with the properties

o = (shape=rectangle, color=black, width=42, height=26)

you apply the transformation obtained from Fisher's LD and measure the correlation (scalar vector product) with the transformed data points of each class, i.e. calculate (T*o)*(T*p_backpack)' and (T*o)*(T*p_box)', which relate to the probability that the object o is actually a backpack or a box.
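As a rough illustration of the training and classification steps, here is a two-class Fisher's linear discriminant written in plain Python. It is only a sketch under several assumptions: it uses just the numeric features (width and height), since shape and color would first need a numeric encoding; the training points are synthetic, drawn from made-up normal distributions; and it decides by comparing the projection of o against the midpoint between the projected class means, which is a simpler rule than the scalar-product comparison described above.

```python
import random

random.seed(0)  # reproducible synthetic data

def sample(mean_w, std_w, mean_h, std_h, n):
    # Draw n (width, height) points in cm from independent normals.
    return [(random.gauss(mean_w, std_w), random.gauss(mean_h, std_h))
            for _ in range(n)]

# Hypothetical training sets; the distribution parameters are assumptions.
boxes = sample(35.0, 7.5, 20.0, 5.0, 200)
backpacks = sample(45.0, 2.5, 25.0, 2.5, 200)

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    # Entries of the 2x2 scatter matrix: sum of (p - m)(p - m)^T.
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

m_box, m_bag = mean(boxes), mean(backpacks)
a1, b1, c1 = scatter(boxes, m_box)
a2, b2, c2 = scatter(backpacks, m_bag)
# Within-class scatter S_w = [[a, b], [b, c]].
a, b, c = a1 + a2, b1 + b2, c1 + c2

# Fisher direction w = S_w^{-1} (m_bag - m_box), via the closed-form 2x2 inverse.
det = a * c - b * b
dx, dy = m_bag[0] - m_box[0], m_bag[1] - m_box[1]
w = ((c * dx - b * dy) / det, (-b * dx + a * dy) / det)

def project(p):
    # Map a 2D point onto the 1D discriminant axis.
    return w[0] * p[0] + w[1] * p[1]

# Decision threshold: halfway between the projected class means.
threshold = (project(m_box) + project(m_bag)) / 2.0

def classify(p):
    return "backpack" if project(p) > threshold else "box"

print(classify((42.0, 26.0)))
```

The hard label could be turned into a soft score by looking at how far project(o) lies from the threshold relative to the spread of each projected class. In practice a library implementation of linear discriminant analysis would replace the hand-written matrix algebra.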

If you are considering AI, take a look at http://pybrain.org/

It is a very high-level Python AI library. I've had some good luck using it for pattern recognition (using a neural network). It's easy enough to play with and will let you experiment with different approaches quickly.

I'd try an AI algorithm populated by user input.
