The researchers' system gained its expertise by being exposed to thousands of pictures that included objects such as mountains, flowers, people, water, and tigers, as well as the semantic tags that corresponded to the objects. Then the researchers tested how well the system performed by exposing it to new pictures that included objects that weren't yet labeled. When compared with a human's description of a scene, the system did well: a picture of a tiger in tall grass prompted the system to find "cat," "tiger," "plants," "leaf," and "grass." A human-made caption included "cat," "tiger," "forest," and "grass." And when the researchers compared their system's tags with more typical content-based approaches, they found that it did better by about 40 percent. In other words, it produced fewer words that were not applicable to the image.
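The tiger example above can be turned into a small worked comparison. The sketch below is only an illustration: the article does not say which metric the researchers used, so the precision and recall measures here are an assumption, and `tag_overlap` is a hypothetical helper name. Precision is the share of system tags that also appear in the human caption, which loosely matches the idea of producing "fewer words that were not applicable to the image."

```python
def tag_overlap(system_tags, human_tags):
    """Return (precision, recall) of system tags measured against human tags.

    Assumed metric for illustration only; not taken from the UCSD work.
    """
    system = {tag.lower() for tag in system_tags}
    human = {tag.lower() for tag in human_tags}
    shared = system & human
    # Precision: fraction of system tags a human also used.
    precision = len(shared) / len(system) if system else 0.0
    # Recall: fraction of human tags the system managed to find.
    recall = len(shared) / len(human) if human else 0.0
    return precision, recall


# Tags quoted in the article for the picture of a tiger in tall grass.
system_tags = ["cat", "tiger", "plants", "leaf", "grass"]
human_tags = ["cat", "tiger", "forest", "grass"]

precision, recall = tag_overlap(system_tags, human_tags)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.60 recall=0.75
```

Exact string matching is a harsh judge here: "plants" and "leaf" count as misses even though a person would probably accept them for this picture.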
Larry Zitnick, an image-search researcher at Microsoft, says that the research is pushing the limits of content-based search to see how well it can work. "What they're doing is analyzing how far we can go based on [searching an image for objects], and that's really good as far as pushing the envelope." He also suspects that the approach could work well for large sets of images, such as those on the Internet.
Zitnick adds that the UCSD results could be great for certain types of simple object searches in pictures. However, the approach would not work for other searches, such as distinguishing the U.S. Capitol building from the state capitol building in Lincoln, NE. "Visual problems are very difficult, and I don't think any one solution is going to solve everything," Zitnick says.
However, the researchers' approach could be useful if folded into existing search software, says Chuck Rosenberg, a Google software engineer who works on image search. If incorporated into desktop search, the approach could allow people to search for images based on the similarity of appearance. But it wouldn't necessarily help people find pictures based on more obscure concepts such as happiness. "For example," Rosenberg says, "I might want a picture of a happy family out for an evening walk to put on a card I'm making. For a computer to truly find that picture based on the content of the image alone ... is beyond current technology."
Vasconcelos of UCSD suspects that it will be more than five years before computers are able to identify more-difficult concepts, such as happiness, in pictures. But that doesn't mean current research won't be useful before then, he says. "The expectation has to be that [the technology] is more like an aid, not like an answer."
Currently available open-source image search technology
Content-based image search (query by example) is already available to anyone with an image-related website or software through the recently open-sourced isk-daemon project.
riecksd
1 Comment
It's Important to Distinguish between Informal Metadata and "Embedded Metadata"
When Greene asserts that "This is because most image-based searches use metadata--text, such as a file name, date, or other basic information associated with a picture--that can be incomplete, useless for keyword searches, or absent altogether," she is, in my mind, missing one important distinction.
It's important to differentiate between "metadata" of the type that happens to be in the associated text, caption, and page title of the HTML page where the image resides, and the "embedded metadata" that the creator or distributor might add to an image using standard metadata schemas such as the one provided by the International Press Telecommunications Council (IPTC) and made popular by the File Info feature of imaging programs such as Adobe Photoshop.
Comments such as Greene's could lead readers to assume that the IPTC and EXIF (the latter automatically added by digital cameras) metadata in an image is being used by the search engines, when we've seen no indication that this is the case.
Many of the images on the internet that were taken by professionals may have IPTC embedded metadata, but this is virtually ignored at present. The technology to write this type of information has been around for several decades. In addition, since the inception of XMP-style metadata in 2001, technology has existed to read this information directly from images on the web.
See http://www.dphoto.us/convert for one example.
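As a rough illustration of how little is needed to read such metadata, here is a minimal sketch that scans raw file bytes for an XMP packet. It assumes the packet is stored as plain UTF-8 XML wrapped in an `x:xmpmeta` element, which is the common case in JPEG files but not guaranteed for every format or encoding. `extract_xmp` is a hypothetical helper name, and a byte scan like this is no substitute for a real metadata parser.

```python
import re

# XMP packets are usually embedded as plain XML text, so a byte-level
# search for the wrapper element finds them in typical files.
XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def extract_xmp(data: bytes):
    """Return the first XMP packet in the raw bytes as text, or None."""
    match = XMP_PACKET.search(data)
    if match is None:
        return None
    # Assumes UTF-8; undecodable bytes are replaced rather than raising.
    return match.group(0).decode("utf-8", errors="replace")


# Stand-in bytes for demonstration; a real image would be read from disk
# or fetched from the web in binary mode.
with_metadata = (
    b"\xff\xd8"
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><dc:subject>tiger</dc:subject></x:xmpmeta>'
    b"\xff\xd9"
)
stripped = b"\xff\xd8\xff\xd9"

print(extract_xmp(with_metadata) is not None)  # True
print(extract_xmp(stripped) is None)           # True
```

The second case mirrors an image whose metadata was removed during preparation for the web: the scan simply finds nothing.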
Having systems that can automatically determine subject matter in an image is great news, but coupling them with technology that probes the image for any existing embedded metadata could enhance the results well beyond what the associated text on the page alone can offer.
One caveat: not all images have this embedded metadata, even if it was in the original file. This is because a number of applications that prepare images for the web routinely remove metadata (EXIF, IPTC, ICC color profiles) in the name of compactness. This practice is a dangerous one, as it also removes ownership metadata and, in today's society, creates potential "orphan works" that may be abused. See the Metadata Manifesto http://MetadataManifesto.blogspot.com/ for a whitepaper that discusses this in detail.
In the interim, I look forward to seeing the results of this subject recognition technology.
David Riecks
http://www.ControlledVocabulary.com/