Problem: Images are hard for search engines to index because computers find it difficult to identify their content. Algorithms called classifiers can sort images using statistical techniques, but that presents something of a chicken-and-egg problem: ideally, “you need millions of [classified] images to train a classifier,” says Jian Sun, a researcher at Microsoft Research Asia in Beijing.
Solution: Sun developed a way to make it easy for humans to train computers in picture classification. With his system, which was recently incorporated into Microsoft’s Bing Images search engine, users enter a search term, say “cloudy sky.” Using its existing classification algorithm, Bing makes its best attempt to present a grid of images that match the search term. The user can click on a nearly right image and ask to see similar pictures, repeating the process until the perfect image appears. As the user refines the search, each click is fed back into the classifier. This means the next time a user searches for “cloudy sky,” Bing will immediately present a more relevant set of images than before. The system is also being used to help other researchers develop image search algorithms; incorporating results from Bing, Sun has released a training database containing 100,000 categorized images.
–David Cohen
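The feedback loop described above, in which each click nudges the classifier so that later searches start from better results, can be illustrated with a toy example. The sketch below is not Sun's or Bing's actual method, which the article does not detail; it assumes a simple linear scorer over made-up two-number image features, and every name in it (ClickTrainedRanker, record_click, the feature values) is hypothetical.

```python
class ClickTrainedRanker:
    """Toy ranker: scores images with a per-query linear model and
    treats every user click as a positive training example."""

    def __init__(self, image_features, initial_weights, learning_rate=0.5):
        # image_features: {image_id: [feature values]} (hypothetical features)
        # initial_weights: {query: [weights]} standing in for the existing classifier
        self.image_features = image_features
        self.weights = {q: list(w) for q, w in initial_weights.items()}
        self.learning_rate = learning_rate

    def _score(self, query, image_id):
        weights = self.weights[query]
        features = self.image_features[image_id]
        return sum(w * x for w, x in zip(weights, features))

    def search(self, query):
        # Return all image ids, best match first.
        return sorted(self.image_features,
                      key=lambda image_id: self._score(query, image_id),
                      reverse=True)

    def record_click(self, query, image_id):
        # Move the query's weights toward the clicked image's features,
        # so images that resemble it also score higher next time.
        features = self.image_features[image_id]
        self.weights[query] = [
            w + self.learning_rate * x
            for w, x in zip(self.weights[query], features)
        ]


# Made-up features: [how cloud-like, how beach-like]
images = {
    "clouds1": [1.0, 0.0],
    "clouds2": [0.9, 0.1],
    "beach": [0.0, 1.0],
}
# An imperfect starting classifier that wrongly favors the beach photo.
ranker = ClickTrainedRanker(images, {"cloudy sky": [0.2, 0.3]})

print(ranker.search("cloudy sky"))   # ['beach', 'clouds2', 'clouds1']
ranker.record_click("cloudy sky", "clouds1")
print(ranker.search("cloudy sky"))   # ['clouds1', 'clouds2', 'beach']
```

In this toy version a single click on one cloud photo also lifts the other, unclicked cloud photo above the beach picture, which is the sense in which the next search begins with a more relevant set of images.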
Learning machine: Bing lets users refine search results (top), producing images that better match a search term (middle). New searches will then produce better initial results (bottom).