Travel App Can Recommend Places by Looking at Them
Software that counts dogs, martini glasses, and mustaches in Instagram photos provides a novel way to rate businesses.
Software that can understand the contents of images could provide valuable new data sources.
A travel app called Jetpac hopes to tackle two of the most pressing questions of our time: how can machines reliably extract information from images, and what exactly is the definition of a hipster?
Jetpac provides a consumer guide to local restaurants, bars, and coffee shops. But unlike competitors such as Yelp, it doesn’t rely on customers writing up reviews. Instead, the company uses software to process public Instagram photos tagged with a business’s name, measuring things like the number of smiles in a picture or the amount of blue sky. Jetpac uses that information to help people searching for a tranquil coffee shop with outdoor seating or a suitable venue for a social gathering.
“It’s like you stuck your head in the bar,” says Jetpac CTO Pete Warden. “Photos have a lot of signals in them.” Those include whether a bar is dog-friendly (which can be determined by counting pooches per picture) or high-class (by looking for clues such as martini glasses rather than beer cans).
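To make the idea of "pooches per picture" concrete, here is a minimal sketch of how per-photo object counts might be rolled up into venue-level signals. It is not Jetpac's code: the label names, the `venue_signals` helper, and the 0.2 threshold are all assumptions made for illustration, and the detector that produces the per-photo counts is taken as given.

```python
from collections import Counter


def venue_signals(photos):
    """Average how often each label appears per photo for one venue.

    `photos` is a list of dicts mapping a detected label to the number of
    times it was found in a single photo (hypothetical detector output).
    """
    if not photos:
        return {}
    totals = Counter()
    for labels in photos:
        totals.update(labels)  # adds the per-photo counts to the running totals
    return {label: count / len(photos) for label, count in totals.items()}


def is_dog_friendly(signals, threshold=0.2):
    # Assumed rule of thumb: at least `threshold` dogs per photo on average.
    return signals.get("dog", 0.0) >= threshold


def is_upscale(signals):
    # Assumed rule of thumb: martini glasses outnumber beer cans.
    return signals.get("martini_glass", 0.0) > signals.get("beer_can", 0.0)


photos = [
    {"dog": 1, "beer_can": 2},
    {"martini_glass": 1},
    {"dog": 1},
    {},
]
signals = venue_signals(photos)
print(signals["dog"], is_dog_friendly(signals), is_upscale(signals))
```

Averaging per photo, rather than summing, keeps a heavily photographed bar from looking dog-friendly merely because it has more pictures.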
Jetpac’s image analysis can also reveal things about specific Instagram users that guide its recommendations. Gastronomes tend to snap Instagram pictures of their groceries, so restaurants they frequent are likely to be foodie favorites. If the majority of an Instagram user’s photos are in Seattle and suddenly a few smiling pictures appear in Boston, Jetpac takes it as a signal that the person is visiting a good tourist spot.
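The Seattle-to-Boston heuristic can be sketched in a few lines. This is an illustration under stated assumptions, not Jetpac's method: each photo is assumed to arrive as a dict with hypothetical `city`, `venue`, and `smiles` fields, and "home" is assumed to mean the city holding a strict majority of a user's photos.

```python
from collections import Counter


def home_city(photos):
    """Return the city holding a strict majority of the user's photos, if any."""
    if not photos:
        return None
    city, count = Counter(p["city"] for p in photos).most_common(1)[0]
    return city if count > len(photos) / 2 else None


def tourist_spot_candidates(photos, min_smiles=1):
    """List venues photographed away from home with at least `min_smiles` smiles."""
    home = home_city(photos)
    if home is None:
        return []  # no clear home city, so no travel signal
    return [
        p["venue"]
        for p in photos
        if p["city"] != home and p["smiles"] >= min_smiles
    ]


photos = [
    {"city": "Seattle", "venue": "Cafe A", "smiles": 0},
    {"city": "Seattle", "venue": "Bar B", "smiles": 2},
    {"city": "Seattle", "venue": "Cafe A", "smiles": 1},
    {"city": "Seattle", "venue": "Park C", "smiles": 0},
    {"city": "Boston", "venue": "Harbor Walk", "smiles": 2},
    {"city": "Boston", "venue": "Airport", "smiles": 0},
]
print(tourist_spot_candidates(photos))  # ['Harbor Walk']
```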
Jetpac does turn to humans to help its software with more qualitative measures, though. To inform the app’s “hipster finder,” which tries to point people to the coolest places in a city, Warden and his team used the crowdsourcing service Mechanical Turk. People were asked to label photos with key markers, like mustaches, plaid clothing, or chunky glasses, providing baseline data that allowed software to look for similar patterns in future photos to peg establishments with high hipster attendance.
Warden’s company uses software based on deep learning, an approach to training software loosely modeled on the brain and pioneered at Google (see “10 Breakthrough Technologies 2013: Deep Learning”). Jetpac’s algorithms are based largely on the research of deep learning expert and current Google employee Geoff Hinton (Google declined to make him available for this story). Jetpac has made the code for some of its deep learning software freely available, and released an iPhone app that can be trained to recognize objects using a device’s camera.
However, even using deep learning, images remain difficult for software to understand. Software can be very accurate at identifying a smile when there is a single face in a photo, says James Shanahan, vice president of data science for ad platform NativeX. But such systems fare less well with more complex images. “With three or more people, things get difficult,” Shanahan says.
Altogether, software can’t yet reliably understand everything in a single image, says Andrew Ng, chief scientist at Baidu, who previously worked on deep learning at Google. “It’s a difficult computer vision problem to look at a picture and determine the ‘mood’ of the scene,” he says.
Jetpac also has to work against the fact that Instagram images are often blurry, under- or overexposed, or distorted by the service’s signature filters, and that they represent a carefully curated slice of reality. Social network enthusiasts tend to share only the good times. “Instagram is a lot more intentional,” Warden says. The younger demographic of Instagram users also means that more expensive restaurants are underrepresented in Jetpac’s data. And pictures with smiles are not necessarily indicative of the quality of a bar or restaurant, given that many people tend to smile for a camera anywhere after a few drinks. Warden notes that the number of pictures with smiles spikes on Friday and Saturday evenings.
However, Warden says that combining results from multiple photos makes it possible to glean accurate enough information. Yelp reviews tend to focus on the mechanics of an establishment like service and food quality, says Warden, but looking at images allows Jetpac to get a sense of the experience of being there. “We’re not trying to do a scientific survey, but the more data we get, the better the picture we’re likely to get of what the place is actually like.”
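Warden's point about pooling photos follows from basic statistics: the uncertainty of an averaged estimate shrinks as the sample grows. The sketch below is only an illustration of that principle, not a description of Jetpac's pipeline. It assumes each photo yields an independent yes/no smile flag (a simplification, and `smile_rate` is a hypothetical helper) and reports the usual binomial standard error.

```python
import math


def smile_rate(flags):
    """Return (share of photos with a smile, standard error of that share).

    `flags` is a list of booleans, one per photo, treated as independent.
    """
    n = len(flags)
    if n == 0:
        raise ValueError("need at least one photo")
    p = sum(flags) / n
    return p, math.sqrt(p * (1 - p) / n)


few = [True, False] * 5      # 10 photos, half with smiles
many = [True, False] * 500   # 1,000 photos, half with smiles

rate_few, err_few = smile_rate(few)
rate_many, err_many = smile_rate(many)
print(f"{rate_few:.2f} +/- {err_few:.3f}")    # 0.50 +/- 0.158
print(f"{rate_many:.2f} +/- {err_many:.3f}")  # 0.50 +/- 0.016
```

With a hundred times as many photos, the error bar is ten times narrower, which is why a noisy per-photo detector can still support a usable venue-level estimate. It does not correct for the systematic biases described earlier, such as people sharing only the good times.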