New research recently published by the University of California, San Diego could allow Google’s Image Search to begin using elements of “true image search”: the ability for software to detect and identify elements appearing within the image itself, rather than relying only on external text metadata to associate keywords with the images. Read on for more details.
This new method is called “Supervised Multiclass Labeling” (SML). In an article titled “Supervised Learning of Semantic Classes for Image Annotation and Retrieval”, published in the IEEE (Institute of Electrical and Electronics Engineers) Computer Society journal TPAMI (Transactions on Pattern Analysis and Machine Intelligence), researchers Gustavo Carneiro, Antoni B. Chan, Pedro J. Moreno, and Nuno Vasconcelos describe a system that is trained on a seed set of photos which humans have labeled with keywords for the things seen in them. The trained system is then run against a database of unlabeled images. For each image, it calculates the probability that the various objects or “classes” it has been trained to recognize are present, and labels the image accordingly. After labeling, images can be retrieved via keyword searches using the newly generated metadata.
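To make that train-then-label flow concrete, here is a rough Python sketch. It is not the model from the paper: it swaps in a single diagonal Gaussian per keyword as a simplified stand-in, and the function names, the two-number "features" (imagine average blueness and greenness), and the image IDs are all made up for illustration. It only shows the shape of the idea: learn per-keyword statistics from a human-labeled seed set, compute the probability of each keyword for an unlabeled image, attach the most probable keywords, and then answer keyword searches from those labels.

```python
import math
from collections import defaultdict

def train(seed_set):
    """Fit one diagonal Gaussian per keyword from (features, keywords) pairs.

    Assumption for illustration only: one Gaussian per keyword over a
    hand-made feature vector, not the model used in the paper.
    """
    grouped = defaultdict(list)
    for features, keywords in seed_set:
        for kw in keywords:
            grouped[kw].append(features)
    total = sum(len(vectors) for vectors in grouped.values())
    models = {}
    for kw, vectors in grouped.items():
        n, dims = len(vectors), len(vectors[0])
        mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
        # A small constant keeps the variance away from zero on tiny seed sets.
        var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n + 1e-3
               for d in range(dims)]
        models[kw] = (mean, var, n / total)
    return models

def class_probabilities(models, features):
    """Return the normalized probability of each keyword for one image."""
    log_scores = {}
    for kw, (mean, var, prior) in models.items():
        score = math.log(prior)
        for x, m, v in zip(features, mean, var):
            score += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        log_scores[kw] = score
    top = max(log_scores.values())  # subtract the max for numerical stability
    weights = {kw: math.exp(s - top) for kw, s in log_scores.items()}
    norm = sum(weights.values())
    return {kw: w / norm for kw, w in weights.items()}

def annotate(models, features, top_k=1):
    """Label an image with its top_k most probable keywords."""
    probs = class_probabilities(models, features)
    return sorted(probs, key=probs.get, reverse=True)[:top_k]

def build_index(models, unlabeled, top_k=1):
    """Map each keyword to the unlabeled images that were tagged with it."""
    index = defaultdict(list)
    for image_id, features in unlabeled.items():
        for kw in annotate(models, features, top_k):
            index[kw].append(image_id)
    return dict(index)

# Hypothetical seed set: (feature vector, human-supplied keywords).
seed_set = [
    ([0.9, 0.1], {"sky"}),
    ([0.8, 0.2], {"sky"}),
    ([0.1, 0.9], {"grass"}),
    ([0.2, 0.8], {"grass"}),
]
# Hypothetical unlabeled database.
unlabeled = {"img_a": [0.85, 0.15], "img_b": [0.15, 0.85]}

models = train(seed_set)
index = build_index(models, unlabeled)
print(index.get("sky", []))  # images retrievable by the keyword "sky"
```

A real system would need far richer image features and a far larger seed set, but the keyword search at the end works the same way: it only ever consults the labels the software generated.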
The accuracy of this new method has apparently been superior to that of other previously published content-based image labeling systems developed by information retrieval specialists. The SML system can also split up images based on their identifiable regions of content – a process which has historically been quite difficult for software systems to accomplish. For example, this method could separate a landscape photo into mountain, sky and lake regions and then identify those things based on the training data.
One of the engineers who contributed to the development of the method and the associated published paper, Pedro Moreno, is a researcher at Google who sometimes contributes to the Google Research Blog. Google just happens to have large quantities of images to use for such research, of course.
John Battelle recently mentioned the dream of being able to truly search by an image’s content, and it would appear that the concept is really close to fruition through this SML method.
Now, one must ask oneself, how might a search engine like Google first develop a good sample set of human-labeled images in order to “train” this sort of new algorithmic labeling program? But, oh, wait: Google already has an Image Labeler program in beta release, which invites users to come in and submit keywords to be associated with images through a sort of game. And Vanessa Fox told us in the SES Images & Search panel discussion in Chicago last year that some of their users really enjoy participating in the Image Labeler program. So, it’s not at all hard to connect the dots. If the automated Supervised Multiclass Labeling system were hooked up to the rich, trustworthy data developed through the Image Labeler program, Google would almost overnight have the ability to perform true image search based on images’ graphic content.
If that Image Labeler user-tagging method were combined with this new algorithmic method, those users could become live “trainers” for the software. As time progresses, the software could become steadily more accurate and more efficient at attaching appropriate keywords to images as metadata.
Using this method, Google might only need a relatively small seed set of images to be tagged by humans in order to train their software to identify millions of other images. The end result would be a fantastically better Image Search, using more accurate data for associating users’ keyword search requests with images appropriate for the keyword. This sort of advantage would put them ahead of nearly all the other image search services out there.
Of course, there are other true image search services, but none of them have the user following of Google, I would bet.
A press release from UCSD on the research paper includes a nice video of UCSD professor Nuno Vasconcelos talking about the SML method if you’re interested.
(Nota bene: I’ll be speaking on increasing website traffic through image optimization and optimizing for Image Search at the SES Conference next week. I’ve also previously blogged on the subject of image optimization.)
Posted by Chris of Silvery on 04/05/2007