Last week at SIGGRAPH, an international conference on computer graphics, a group presented an innovative system designed to analyze images of the sky. Most commercial image-search systems figure out what’s in an image by analyzing the associated text, such as the words surrounding a picture on a Web page or the tags provided by humans. But ideally, the software would analyze the content of the image itself. Much research has been done in this area, but so far no single system has solved the problem. The new system, called SkyFinder, could offer important insight into how to make an intuitive, automatic, scalable search tool for all images.
Jian Sun, who worked on SkyFinder and is the lead researcher for the Visual Computing Group at Microsoft Research, says that the traditional approach to image search sometimes leads to nonsensical results when a computer misinterprets the surrounding text. Typically, engines that analyze the content of images instead of text need a picture to guide the search: something submitted by the user that looks a lot like the intended result. Unfortunately, such an image may not be easy for the user to find. Sun says SkyFinder, in contrast, provides good results while also letting the user interact intuitively with the search engine.
To search for a specific kind of sky image, the user simply enters a request in fairly natural language, such as “a sky covered with black clouds, with the horizon at the very bottom.” SkyFinder will offer suggested images matching that description.
Each image is processed after it is added to the database. Using a popular method called “bag of words,” Sun explains, the image is broken into small patches, each of which is analyzed and assigned a codeword describing it visually. By analyzing the patterns of the codewords, the system classifies the image in categories such as “blue sky” or “sunset,” and determines the position of the sun and horizon. By doing this work offline, Sun says, the system can easily be scaled to search very large image databases. (The SkyFinder database currently contains half a million images.)
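As a rough illustration of the "bag of words" idea described above, the following minimal Python sketch assigns each patch to its nearest codeword, builds a histogram of codewords, and picks the closest category. Everything specific here is an assumption made for illustration: the color-based codebook, the category profiles, and the function names are hypothetical, and SkyFinder's actual features and classifier are not described in this article.

```python
from collections import Counter
from math import dist

# Hypothetical codebook: each codeword stands for a typical patch color (RGB).
# A real system would learn its codewords from richer visual features.
CODEBOOK = {
    "blue": (70, 130, 220),
    "white": (240, 240, 240),
    "gray": (90, 90, 100),
    "orange": (240, 140, 60),
}

# Hypothetical reference histograms for each category (fractions of patches).
CATEGORY_PROFILES = {
    "blue sky": {"blue": 0.9, "white": 0.1},
    "cloudy sky": {"gray": 0.7, "white": 0.3},
    "sunset": {"orange": 0.6, "gray": 0.2, "blue": 0.2},
}


def nearest_codeword(patch_color):
    """Return the name of the codeword closest to a patch's mean color."""
    return min(CODEBOOK, key=lambda name: dist(CODEBOOK[name], patch_color))


def codeword_histogram(patch_colors):
    """Count codewords over all patches and normalize to fractions."""
    if not patch_colors:
        raise ValueError("an image needs at least one patch")
    counts = Counter(nearest_codeword(color) for color in patch_colors)
    total = sum(counts.values())
    return {name: counts[name] / total for name in CODEBOOK}


def classify(histogram):
    """Pick the category whose profile is closest (squared distance)."""
    def distance(profile):
        return sum((histogram[n] - profile.get(n, 0.0)) ** 2 for n in CODEBOOK)
    return min(CATEGORY_PROFILES, key=lambda cat: distance(CATEGORY_PROFILES[cat]))


# Ten patches from an imaginary image: mostly blue, with one white patch.
patches = [(72, 128, 215)] * 9 + [(238, 241, 239)]
label = classify(codeword_histogram(patches))
```

Because the histogram and label depend only on the image itself, they can be computed once when the image is added and stored alongside it, which is the offline step Sun credits for the system's scalability.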
It’s also possible to fine-tune search terms using a visual interface. The system offers a screen, for example, where the user can adjust icons to show the desired positions of the sun and horizon. Those coordinates are added to the search.
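One way to picture how those coordinates could narrow a search is sketched below in Python. It assumes each image was stored with a category, a horizon height, and an optional sun position, all precomputed offline; the record layout, the normalized 0-to-1 coordinates, and the tolerance value are hypothetical choices for this sketch, not details of SkyFinder.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SkyRecord:
    """Hypothetical per-image attributes, computed once when the image is added."""
    name: str
    category: str
    horizon_y: float                        # 0.0 = top of image, 1.0 = bottom
    sun_xy: Optional[Tuple[float, float]]   # None when no sun is visible


def search(index: List[SkyRecord], category: str,
           horizon_y: Optional[float] = None,
           sun_xy: Optional[Tuple[float, float]] = None,
           tolerance: float = 0.15) -> List[SkyRecord]:
    """Return records in the category whose positions fall within the tolerance."""
    results = []
    for record in index:
        if record.category != category:
            continue
        if horizon_y is not None and abs(record.horizon_y - horizon_y) > tolerance:
            continue
        if sun_xy is not None:
            if record.sun_xy is None:
                continue  # a sun position was requested but none was detected
            if (abs(record.sun_xy[0] - sun_xy[0]) > tolerance
                    or abs(record.sun_xy[1] - sun_xy[1]) > tolerance):
                continue
        results.append(record)
    return results


index = [
    SkyRecord("a.jpg", "sunset", horizon_y=0.9, sun_xy=(0.5, 0.8)),
    SkyRecord("b.jpg", "sunset", horizon_y=0.5, sun_xy=(0.2, 0.4)),
    SkyRecord("c.jpg", "blue sky", horizon_y=0.9, sun_xy=None),
]

# Sunset images with the horizon near the bottom and the sun low in the center.
matches = search(index, "sunset", horizon_y=0.95, sun_xy=(0.5, 0.75))
```

The query touches only the stored attributes, never the pixels, so the cost of the search does not depend on how expensive the original image analysis was.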
SkyFinder arranges images logically on the screen, for example from blue sky to cloudy sky or from daytime to sunset. Once the user has found an image she likes, she can use it to guide a more targeted search to find similar images.
The system also includes tools to help a user replace the sky in one image with the sky from another picture.
“Computer graphics has had enormous successes in the past decades, but it is still impossible for an average computer user to synthesize an arbitrary image or video to their liking,” says James Hays, who was not involved with the research and has a PhD in computer science from Carnegie Mellon University. He believes it’s important to develop more-sophisticated tools for inexperienced users. Such people could use a tool like SkyFinder to find an image they want or to make adjustments to an existing image. Hays believes SkyFinder’s main contribution is its user interface.
Ritendra Datta, an engineer at Google who has studied machine learning and image search, says that allowing computers to understand automatically what’s being shown in an image remains one of the major open problems in image search. “SkyFinder seems to be an interesting new approach” that works for one type of image, he says. Datta believes that advances in specialized applications could eventually be applied on a broader scale.
He thinks, however, that thorough usability studies are badly needed for search systems that rely on automatic analysis of images.
Sun plans to improve SkyFinder by adjusting it to analyze more attributes of the sky and by expanding the database. For now, he says, systems that automatically analyze images have to be trained completely differently depending on what type of image they’re working with. However, he says his work with SkyFinder could be used to identify pictures of the sky among a general bank of images.