Technology Review - Published By MIT
Better, More Accurate Image Search

By modifying a common type of machine-learning technique, researchers have found a better way to identify pictures.

By Kate Greene

Monday, April 09, 2007

Researchers at the University of California, San Diego (UCSD), have developed a new image-search method that they claim outperforms existing approaches "by a significant margin" in terms of accuracy and efficiency. The researchers' approach modifies a typical machine-learning method used to train computers to recognize images, says Nuno Vasconcelos, professor of electrical and computer engineering at UCSD. The result is a search engine that automatically labels pictures with the names of the objects in them, such as "radish," "umbrella," or "swimmer." And because the approach uses words to label and classify parts of pictures, it lends itself nicely to typical keyword searches that people perform on the Web, says Vasconcelos.

Finding photos: A new algorithm developed at UCSD that adds word tags to images can increase image-search accuracy and efficiency. Above, features from a picture are assigned a likelihood that they belong in certain categories, such as “water” or “person.”
Credit: Gustavo Carneiro, Antoni B. Chan, Pedro J. Moreno, and Nuno Vasconcelos

Currently, searching for images on the Internet using keywords can be hit-or-miss. This is because most image-based searches use metadata--text, such as a file name, date, or other basic information associated with a picture--that can be incomplete, useless for keyword searches, or absent altogether. Computer scientists have been working on better ways to identify pictures and make them searchable for more than a decade, but getting machines to go beyond metadata and determine what objects are in a picture is a tough problem to solve, and most efforts to date have only been moderately successful.

While the UCSD research doesn't completely solve the problem, it improves performance and efficiency for a certain approach, says Vasconcelos, and it identifies some "limitations in the way people were addressing the problem."

The approach that the researchers tackled is called "content-based," and it involves describing objects in a picture by analyzing features such as color, texture, and lines. These objects can be represented by sets of features and then compared with the sets extracted from other pictures. Feature sets are described by their statistics, and the computer searches for statistically likely matches.
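The content-based matching described above can be illustrated with a toy sketch. This is not the UCSD system: the feature vectors, the file names, and the functions `summarize`, `distance`, and `best_match` are all hypothetical, and the per-dimension mean and variance stand in for whatever statistics a real system would use.

```python
from statistics import mean, pvariance


def summarize(features):
    """Describe a set of feature vectors by per-dimension (mean, variance)."""
    columns = list(zip(*features))
    return [(mean(col), pvariance(col)) for col in columns]


def distance(stats_a, stats_b):
    """Toy dissimilarity: squared mean difference scaled by the averaged variance."""
    total = 0.0
    for (mean_a, var_a), (mean_b, var_b) in zip(stats_a, stats_b):
        pooled = (var_a + var_b) / 2 + 1e-9  # small constant avoids division by zero
        total += (mean_a - mean_b) ** 2 / pooled
    return total


def best_match(query_features, database):
    """Return the name of the stored image whose statistics are closest to the query."""
    query_stats = summarize(query_features)
    return min(
        database,
        key=lambda name: distance(query_stats, summarize(database[name])),
    )


# Hypothetical feature vectors (for example, color or texture measurements per region).
database = {
    "lawn.jpg": [[0.2, 0.8, 0.1], [0.3, 0.7, 0.2], [0.25, 0.75, 0.15]],
    "ocean.jpg": [[0.1, 0.3, 0.9], [0.2, 0.2, 0.8], [0.15, 0.25, 0.85]],
}
query = [[0.22, 0.78, 0.12], [0.28, 0.72, 0.18]]

print(best_match(query, database))
```

Here the query is matched purely on numbers, which is the limitation the researchers' added labeling step is meant to address.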

The new research is based on this approach, but it adds an intermediate step, says Pedro Moreno, a Google research engineer who worked on the project. Moreno explains that this new step provides a "semantic label," or a word tag that describes objects in pictures instead of relying solely on sets of numbers.

For instance, consider submitting an image of a dog on a lawn. The objects in the picture are analyzed and compared with results for known categories of objects, such as dogs, cats, or fish. Then the computer provides a statistical analysis that gives the likelihood that a picture matches those categories. The system might score the picture with a 60 percent probability that the main object is a dog and a 20 percent probability that it is a cat or a fish. Thus, the computer deems that, in all likelihood, the picture contains an image of a dog. "The key idea is to represent images in this semantic space," Moreno says. "This seems to improve performance significantly."
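The dog-on-a-lawn example can be sketched as a vector of category probabilities that doubles as a keyword index. This is only an illustration under stated assumptions: the raw scores are made-up numbers, the split of 20 percent each for cat and fish is one reading of the example, and `normalize`, `top_label`, and `keyword_search` are hypothetical helpers rather than the researchers' code.

```python
def normalize(scores):
    """Turn non-negative category scores into probabilities that sum to 1."""
    total = sum(scores.values())
    return {label: value / total for label, value in scores.items()}


def top_label(probabilities):
    """Return the category with the highest probability."""
    return max(probabilities, key=probabilities.get)


def keyword_search(index, keyword):
    """Rank image names by the probability they assign to the keyword, highest first."""
    return sorted(
        index,
        key=lambda name: index[name].get(keyword, 0.0),
        reverse=True,
    )


# Made-up raw scores; a real system would derive them from image features.
index = {
    "dog_on_lawn.jpg": normalize({"dog": 3.0, "cat": 1.0, "fish": 1.0}),
    "goldfish.jpg": normalize({"dog": 1.0, "cat": 1.0, "fish": 8.0}),
}

print(index["dog_on_lawn.jpg"])
print(top_label(index["dog_on_lawn.jpg"]))
print(keyword_search(index, "dog"))
```

Because each image is stored as probabilities over words, a typed keyword can be looked up directly, which is why the approach fits ordinary Web searches.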

Comments

  • It's Important to Distinguish between Informal Metadata and "Embedded Metadata"
    Greene asserts that "This is because most image-based searches use metadata--text, such as a file name, date, or other basic information associated with a picture--that can be incomplete, useless for keyword searches, or absent altogether." In doing so she is, in my mind, missing one important distinction.

    It's important to differentiate between "metadata" of the type that happens to be in the associated text, caption, and page title within an HTML page where the image resides, and the types of "embedded metadata" that the creator or distributor might add to an image using standard metadata schemas such as that provided by the International Press Telecommunications Council (IPTC) and made popular in the File Info feature of imaging programs such as Adobe Photoshop.

    Comments such as Greene's could lead readers to assume that the IPTC and EXIF (the latter automatically added by digital cameras) metadata in an image is being used by the search engines, when we've seen no indication that this is the case.

    Many of the images on the internet that have been taken by professionals may have IPTC embedded metadata, but this is virtually ignored at present. The technology to write this type of information has been around for several decades. In addition, since the inception of XMP-style metadata in 2001, technology has existed to read this information directly from images on the web.

    See http://www.dphoto.us/convert for one example.

    Having systems that can automatically determine subject matter in an image is great news, but if they were coupled with technology that would probe the image for any existing embedded metadata, you could enhance the results way beyond simply using the associated text on the page.

    One caveat: not all images have this embedded metadata, even if it was in the original file. This is because a number of applications that prepare images for the web routinely remove metadata (EXIF, IPTC, ICC color profiles) in the name of compactness. This practice is a dangerous one, as it also removes ownership metadata and in today's society creates potential "orphan works" that may be abused. See the Metadata Manifesto http://MetadataManifesto.blogspot.com/ for a whitepaper that discusses this in detail.

    In the interim, I look forward to seeing the results of this subject recognition technology.

    David Riecks
    http://www.ControlledVocabulary.com/

    riecksd
    04/11/2007
  • Currently available open-source image search technology
    Content-based image search (query-by-example) is currently available for anyone with an image-related website or software to try at the recently open-sourced isk-daemon project.

    rnc000
    09/26/2007

MIT Massachusetts Institute of Technology © 2009 Technology Review. All Rights Reserved.