Want some ideas on how to develop an image retrieval system [closed]
Closed 8 years ago.
I just want to study how Google searches images. I know it is probably too complicated for me, but I want to try it on my own set of images.
This is what I propose: for a large set of random images in a folder, I could use some keywords (maybe not appropriate ones, subject to change based on your ideas) to search for images and sort them, just like Google Images does.
I've talked with some graphics people about how to determine the similarity between images, and they suggested things like:
- Global color histogram
- Image layout and block-based histograms (which I am not very familiar with)
- RAG (region adjacency graph)-based description
So now I really need your ideas. I don't need any code; I would just like you to share what you would consider in designing such a local image search system: how you would define similarity between images, how you would represent images, and so on.
I will continue talking to graphics people to learn, but I also really want your ideas to get me started.
Google has done more than simple text search for images; see this post on the Google Official Blog. Keep in mind that image search based on visual features (Content-Based Image Retrieval, or CBIR) is an open research problem.
Global color histograms can be disappointing: for example, the USA and French flags have similar global color histograms, yet the images are very different. Local color histograms can produce better results for that flag example.
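To make the flag example concrete, here is a minimal NumPy sketch (the toy "flags" are synthetic stripe images I made up, not real flag data): two images with identical colors but different layouts get near-identical global histograms, while simple block-based (local) histograms tell them apart.

```python
import numpy as np

def global_hist(img, bins=8):
    """Flattened per-channel color histogram of the whole image, normalized to sum to 1."""
    h = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    h = np.concatenate(h).astype(float)
    return h / h.sum()

def block_hists(img, grid=2, bins=8):
    """Concatenate per-block histograms over a grid x grid partition (a simple local descriptor)."""
    H, W = img.shape[:2]
    parts = []
    for i in range(grid):
        for j in range(grid):
            block = img[i * H // grid:(i + 1) * H // grid,
                        j * W // grid:(j + 1) * W // grid]
            parts.append(global_hist(block, bins))
    return np.concatenate(parts)

def l1(a, b):
    """L1 (histogram intersection-style) distance between descriptors."""
    return np.abs(a - b).sum()

# Two toy "flags": same three colors, different layout (vertical vs horizontal thirds).
red, white, blue = [255, 0, 0], [255, 255, 255], [0, 0, 255]
vertical = np.zeros((90, 90, 3), np.uint8)
vertical[:, :30], vertical[:, 30:60], vertical[:, 60:] = blue, white, red
horizontal = np.zeros((90, 90, 3), np.uint8)
horizontal[:30], horizontal[30:60], horizontal[60:] = blue, white, red

print(l1(global_hist(vertical), global_hist(horizontal)))  # ~0: global histograms match
print(l1(block_hists(vertical), block_hists(horizontal)))  # > 0: layout difference detected
```

The choice of grid size, bin count, and distance metric are all free parameters; finer grids capture more layout at the cost of sensitivity to translation and cropping.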
You could take a look on Nuno Vasconcelos' work at UCSD.
TinEye is the only image-based search engine I've seen. Its FAQ has some technical details, but the important limitations (which also tell you a little about the sort of detectors they use to construct their "image signature") are:
Can TinEye find similar images? Does TinEye do facial recognition?
TinEye finds exact and altered copies of the images that you submit, including those that have been cropped, colour adjusted, resized, heavily edited or slightly rotated. TinEye does not commonly return similar matches, and it cannot recognize the contents of any image. This means that TinEye cannot find different images with the same people or things in them.
Google image search uses a variety of techniques to return the results you see, but (disappointingly) the biggest one is context in another webpage. Just like regular google search results, the words near an image help determine what is in the image.
Some images are tagged with tools like google's collaborative labeling game, but otherwise it's all context.
This probably won't help you in your goal much, but unfortunately there does not yet exist code which can reliably tell a white kitten from a white horse. Facial recognition is a different matter, but that isn't what you were asking about.
I think Google Images is based on associative text searches. Most attempts at recognizing image content have accuracy far too poor to be useful today. If you are sincerely interested in developing such algorithms, first target an easy problem, where the number of possible objects and the nature of the photos are somewhat controlled. You will want to look into:
Computer vision: how to retrieve useful information from digital images, starting with primitive information like color distribution, edges, circles, etc.
Object recognition: how to detect objects.
Machine learning: make your application improve itself.
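One of the "primitive" features above, an edge map, can be sketched in plain NumPy with 3x3 Sobel kernels (this is a generic illustration of the technique, not code from any of the tools mentioned):

```python
import numpy as np

def sobel_edges(gray):
    """Approximate gradient magnitude of a grayscale image using 3x3 Sobel kernels
    (zero padding at the borders)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # horizontal gradient
    ky = kx.T                                                   # vertical gradient
    p = np.pad(gray.astype(float), 1)
    gx = np.zeros(gray.shape, float)
    gy = np.zeros(gray.shape, float)
    # Correlate by accumulating shifted windows (avoids an explicit convolution loop per pixel).
    for i in range(3):
        for j in range(3):
            window = p[i:i + gray.shape[0], j:j + gray.shape[1]]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)

# A vertical step edge: left half dark, right half bright.
img = np.zeros((8, 8))
img[:, 4:] = 255
edges = sobel_edges(img)
print(edges[4, 3], edges[4, 4])  # strong response on the boundary columns
print(edges[4, 0])               # zero response in the flat region
```

In practice you would use a library routine (e.g. from OpenCV or scikit-image) rather than hand-rolled loops, but the principle, local gradient estimation, is the same.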
ImageJ, an open-source computer vision application, may be a good place to start.
Together with other suggestions, you may want to check out facial recognition.
A commercial example of such techniques is Apple's iPhoto.