Researchers at the University of California, San Diego (UCSD), have developed a new image-search method that they claim outperforms existing approaches “by a significant margin” in terms of accuracy and efficiency. The researchers’ approach modifies a typical machine-learning method used to train computers to recognize images, says Nuno Vasconcelos, professor of electrical and computer engineering at UCSD. The result is a search engine that automatically labels pictures with the names of the objects in them, such as “radish,” “umbrella,” or “swimmer.” And because the approach uses words to label and classify parts of pictures, it lends itself nicely to the typical keyword searches that people perform on the Web, says Vasconcelos.
Currently, searching for images on the Internet using keywords can be hit-or-miss. This is because most image searches rely on metadata (text such as a file name, date, or other basic information associated with a picture) that can be incomplete, useless for keyword searches, or absent altogether. Computer scientists have been working on better ways to identify pictures and make them searchable for more than a decade, but getting machines to go beyond metadata and determine what objects are in a picture is a tough problem to solve, and most efforts to date have been only moderately successful.
While the UCSD research doesn’t completely solve the problem, it improves performance and efficiency for a certain approach, says Vasconcelos, and it identifies some “limitations in the way people were addressing the problem.”
The approach that the researchers tackled is called “content-based,” and it involves describing objects in a picture by analyzing features such as color, texture, and lines. These objects can be represented by sets of features and then compared with the sets extracted from other pictures. Feature sets are described by their statistics, and the computer searches for statistically likely matches.
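The content-based idea described above can be illustrated with a minimal sketch. It is not the UCSD system: it assumes a deliberately crude feature (a coarse color histogram) as the statistical summary of each picture and uses histogram intersection as the match score. The function names, the toy pixel data, and the choice of two bins per color channel are all hypothetical.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=2):
    """Summarize an image as a normalized coarse RGB histogram.

    `pixels` is a list of (r, g, b) tuples with values in 0..255.
    This stands in for the richer color/texture/line features
    the article mentions (an assumption for illustration).
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity between two normalized histograms, from 0.0 to 1.0."""
    return sum(min(value, h2.get(bin_id, 0.0)) for bin_id, value in h1.items())

def best_match(query_pixels, database):
    """Return the name of the database image most similar to the query."""
    query_hist = color_histogram(query_pixels)
    return max(
        database,
        key=lambda name: histogram_intersection(
            query_hist, color_histogram(database[name])
        ),
    )

# Toy "images": mostly green with some brown, and solid blue.
grass = [(30, 200, 40)] * 90 + [(120, 80, 40)] * 10
sky = [(40, 90, 230)] * 100
database = {"grass": grass, "sky": sky}

# A query that is also mostly green with some brown.
query = [(20, 180, 60)] * 80 + [(110, 70, 30)] * 20
print(best_match(query, database))
```

Note that the comparison here happens entirely between sets of numbers; nothing in the sketch knows that the matching picture shows grass.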
The new research is based on this approach, but it adds an intermediate step, says Pedro Moreno, a Google research engineer who worked on the project. Moreno explains that this new step provides a “semantic label,” or a word tag that describes objects in pictures instead of relying solely on sets of numbers.
For instance, consider submitting an image of a dog on a lawn. The objects in the picture are analyzed and compared with results for known categories of objects, such as dogs, cats, or fish. Then the computer provides a statistical analysis that gives the likelihood that the picture matches each of those categories. The system might score the picture with a 60 percent probability that the main object is a dog and a 20 percent probability that it is a cat or a fish. Thus, the computer deems that, in all likelihood, the picture contains an image of a dog. “The key idea is to represent images in this semantic space,” Moreno says. “This seems to improve performance significantly.”
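The semantic-space idea can be sketched in a few lines. The sketch assumes each picture has already been given a raw score per category by some classifier, which is not shown. It turns those scores into probabilities, picks the most likely label, and compares pictures by the distance between their probability vectors. The category list, the scores, and the function names are invented for illustration and do not come from the researchers’ system.

```python
import math

# Hypothetical vocabulary of known categories.
CATEGORIES = ["dog", "cat", "fish"]

def to_semantic_vector(scores):
    """Normalize non-negative per-category scores so they sum to 1."""
    total = sum(scores[c] for c in CATEGORIES)
    return {c: scores[c] / total for c in CATEGORIES}

def most_likely_label(semantic_vector):
    """Return the category with the highest probability."""
    return max(semantic_vector, key=semantic_vector.get)

def semantic_distance(p, q):
    """Euclidean distance between two semantic vectors (one simple choice)."""
    return math.sqrt(sum((p[c] - q[c]) ** 2 for c in CATEGORIES))

# Made-up scores for a query picture of a dog on a lawn.
query = to_semantic_vector({"dog": 6.0, "cat": 2.0, "fish": 2.0})
print(most_likely_label(query))

# Made-up semantic vectors for two already-indexed pictures.
indexed = {
    "dog_photo": {"dog": 0.7, "cat": 0.2, "fish": 0.1},
    "fish_photo": {"dog": 0.1, "cat": 0.1, "fish": 0.8},
}
nearest = min(indexed, key=lambda name: semantic_distance(query, indexed[name]))
print(nearest)
```

Because every picture ends up described by probabilities over words, a keyword query such as “dog” can be answered by ranking pictures on that one entry of the vector.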