Visual Search for Better Online Shopping

A new website lets people search for hard-to-describe items by using pictures instead of words.

Kate Greene

November 8, 2006

Here’s the scenario: You pass a person on the sidewalk wearing a pair of stylish shoes. The leather is light brown, with a rounded toe and a buckle. You’d like to find a similar pair for yourself online. But searching for “shoes, light brown, rounded toe, buckle” probably won’t get you very far.

Launched today, Like.com offers a new method of searching–using pictures instead of text–that may provide a better way to shop. The visual search engine uses a picture as a starting point, and it crawls the webpages of more than 200 online stores, including Amazon.com and L.L. Bean, searching for pictures of items similar to the one you’re interested in. Currently, Like.com looks at more than two million different items in four categories: shoes, handbags, watches, and jewelry. In the next few months, the company hopes to add shirts, pants, and dresses.

“We realized that the place visual search could add the most value is the place where it’s hard to describe an item with words–where you’d want to submit a photo rather than enter text,” says Munjal Shah, creator of Like.com. Shah is also the CEO of the photo-sharing website Riya.com, a site that recognizes faces in submitted photos (see “Face Recognition Software Goes Public”).

Like.com works by using an image as a springboard for the search. Users can base their search on photos from 200 online retailers, and they can select accessories from celebrity photos in the Like.com database. Users can also indicate which characteristics, such as color, shape, or pattern, are most important to them. In addition, they can use traditional text filters to sort by brand, style, and price.

Special software developed by Like.com’s team of computer scientists recognizes similar objects by deconstructing pictures of them. Each image is broken down into 10,000 numbers that represent more than 30 features of the item–for example, the full spectrum of colors that appear in a handbag, its lumps and curves, and the glossiness of its exterior. Additionally, a user can highlight a particular feature of the item that he or she likes the most–for instance, the strap of the watch or the shape of its face–and search within that constraint. The 10,000 numbers that describe the original picture are compared with the numbers that describe the pictures on merchants’ websites.
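The article does not say how Like.com actually compares its numbers, so the following is only a minimal sketch of the general idea: each picture becomes a list of numbers, and the pictures whose lists are closest to the query's are treated as the most similar. The weighted Euclidean distance, the four-number vectors, the item names, and the functions `weighted_distance` and `rank_similar` are all illustrative assumptions, not Like.com's method.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length feature vectors.

    Assumption: a plain distance measure stands in for whatever comparison
    the real system uses, which the article does not describe.
    """
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def rank_similar(query, catalog, weights=None, top_k=3):
    """Return the names of the top_k catalog items closest to the query."""
    if weights is None:
        weights = [1.0] * len(query)  # treat every feature as equally important
    ranked = sorted(
        catalog.items(),
        key=lambda item: weighted_distance(query, item[1], weights),
    )
    return [name for name, _ in ranked[:top_k]]

# Toy vectors: [redness, greenness, toe roundness, glossiness].
# A real system would use thousands of numbers per image; these are made up.
catalog = {
    "tan_buckle_loafer": [0.80, 0.60, 0.30, 0.20],
    "black_patent_pump": [0.10, 0.10, 0.70, 0.90],
    "brown_round_toe": [0.75, 0.55, 0.35, 0.25],
}
query = [0.78, 0.58, 0.33, 0.22]

# All features weighted equally.
print(rank_similar(query, catalog))

# Emphasize shape only, as a user might when shape matters most.
print(rank_similar(query, catalog, weights=[0.0, 0.0, 1.0, 0.0]))
```

Changing the weights is one simple way to model a shopper saying that shape matters more than color: the same query can then produce a different ranking.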

Developing the visual search system was tricky, says Shah. He and his team had to spend a lot of time making sure that their crawler could access the high-resolution version of an image on merchants’ sites (an image with fewer pixels provides less useful information to compare). And, if a merchant’s website offered multiple views and colors, the Web crawler needed to be able to access those as well. Like.com works best with watches and handbags, says Shah, simply because they tend to be photographed consistently and with little glare. Jewelry is more challenging for the search engine to match because of the variation in the way shiny gold and glistening diamonds are lit in photos.

The idea of visual search is certainly not new, says Pawan Sinha, professor of brain and cognitive science at MIT. “Ever since the Web came into being, there has been a large amount of graphical information available,” he says, “and that makes visual search seem like a very attractive idea.” But visual search hasn’t panned out, in part because it’s difficult for a computer to extrapolate context from a photo. For instance, a computer may or may not classify a picture of soldiers raising a flag at Iwo Jima as a World War II event.

Narrowing down the scope of the project to clothing and accessories, Sinha says, helps make the problem more manageable. Still, “it’s a fairly difficult challenge,” he says.

“I think it’s a great idea,” says Sucharita Mulpuru, a senior analyst at Forrester Research. “But I think the big question is how well the algorithm really works–whether or not the product you look for really yields similar results.” She adds that the four categories that Like.com features now are “just scratching the surface.” She thinks the concept could have exciting applications beyond clothing and accessories: it could be used to find furniture, rugs, and wallpaper.

Like.com is a work in progress; it will be tweaked as Shah and his team learn more about how people are using the tool and what they want, he says. And adding shirts to the mix still poses algorithmic challenges. Shah explains that shirts are usually pictured in one of two ways: worn, on a mannequin or a person, or lying flat. For computer vision algorithms, it’s difficult to reconcile the two different versions of a shirt. This is a problem that the Like.com team expects to work out in a couple of months, says Shah.
