Pennies for Web Jobs

Amazon wants to employ people to do menial Web tasks that computers can’t handle.

Sam Williams

March 9, 2006

Speaking to a room filled with Internet developers at the O’Reilly Emerging Technology Conference in San Diego this week, Luis Felipe Cabrera, Amazon’s vice president of software development, outlined a project to harness human intelligence for tasks that computers can’t handle well, such as recognizing objects in images.

The backbone of the plan is a Web-services platform called Mechanical Turk. It uses an auction-style system to farm out complex tasks – complex for a computer, that is – such as recognizing the difference between a human face and a nearby bush, or accurately transcribing an audio recording. Cabrera likes to call the platform “artificial artificial intelligence” – it’s computers asking humans to do tasks, rather than the other way around.

To illustrate the idea, Cabrera cited a test in which A9.com, Amazon’s search engine, asked average users to fulfill “human intelligence tasks” (HITs) – jobs that computers are notoriously bad at doing, such as picking out one building or business within a photograph of a city block in order to highlight that part of the image in association with a business address.

Not only did participants supply the necessary answers, but they did so “outstandingly fast,” according to Cabrera, allowing Amazon to use the photographs in its search results. “This is the tip of the iceberg, but you can see how it enables ‘massively parallel’ human computing,” he said.

Of course, there’s a keen irony in all this. At a conference-cum-show dedicated to technology-based solutions, Amazon’s Mechanical Turk is an allusion to the famous exhibit in the 1760s, by Hungarian showman Wolfgang von Kempelen, in which a chess-playing automaton, known as The Mechanical Turk, was dressed like a Turkish pasha. It wowed royal audiences – and even won a few notable chess battles. And it was a complete fake. Von Kempelen, a century before P.T. Barnum, had simply hidden an undersized chess master within the machine, along with pulleys, gears, and other faux-mechanical props.

While Amazon’s use of the name might suggest a betrayal of the concept of artificial intelligence (AI), it’s actually the latest in a string of experiments dealing with the complementary nature of machine and human intelligence.

Two of the best-known AI applications are Google’s PageRank algorithm, which counts each human-initiated inbound link to a site as a “vote” for that site’s content quality, and Amazon’s recommendation system, which uses algorithms to seek out patterns in customer purchase data to market books and other products to customers whose purchase decisions fit the same pattern.

A more recent example comes from sites like Flickr and del.icio.us, which use human-supplied keywords, or tags, to summarize complex information, such as the thematic content of a photographic image or the functional purpose of a website.

But, unlike this methodology, which depends on users generating information without any compensation, Amazon’s Mechanical Turk will pony up cash to anyone willing to complete the tasks being farmed out. Most of the tasks ask people to do little more than, say, match the address and owner in a real-estate title listing or indicate whether a photograph shows a man or a woman – and each one pays mere pennies.

Such rates put even the most enthusiastic participants on a par with workers in developing economies. Today, many Mechanical Turk users are students or housewives – but who’s to say participants in the future won’t be Vietnamese workers looking to earn a few hundred dong on the side?

“We are gathering demographics now on the provider base, and it seems that a lot of the people that are frequenting the [Mechanical] Turk are work-at-home moms, students, and foreigners,” says David Pfeiffer of DPA Software, a four-person “virtual” company in Waukesha, WI, that’s designing interface tools to make creating such applications feasible for the nontechnical user. “It seems like we’re heading toward a workforce that may not have a singular expertise, but just a general human response,” Pfeiffer says.

With time, however, Pfeiffer sees [Amazon’s] platform as a qualification service for project managers seeking out talented minds. Unlike rival services, such as Google Wizard, which lets responders work their way up through a customer satisfaction system, Amazon’s Mechanical Turk gives project managers the freedom to make their own assessments based on response latency and response quality. For that reason, the next upgrade of Pfeiffer’s main product, HitBuilder, will include grading and ranking features, so that companies can have a better sense of who is supplying the best information.

“Right now [requesters] are having to lowball their hits, because the talent variation is so high; they have to get three or four or five people just to make sure they get a good answer,” Pfeiffer says. “Once you work qualification into the process, you make it easier to boost quality. That’s a software value we’re looking to add.”
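The redundancy Pfeiffer describes, in which several people answer the same task so that one bad response does not spoil the result, can be illustrated with a simple majority vote. The following is only a minimal sketch, not Amazon’s or DPA Software’s actual method: the function name `aggregate_answers`, the `min_agreement` threshold, and the sample responses are all hypothetical.

```python
from collections import Counter


def aggregate_answers(answers, min_agreement=0.5):
    """Return the most common answer if its share exceeds min_agreement.

    Hypothetical helper for illustration only. Returns None when there are
    no answers or when no single answer has enough agreement.
    """
    if not answers:
        return None
    # Normalize case and whitespace so "Woman" and "woman " count together.
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) > min_agreement:
        return winner
    return None


# Four made-up responses to the same photo-labeling task.
responses = ["Woman", "woman", "man", "woman "]
print(aggregate_answers(responses))
```

With these sample responses, three of the four answers agree, so the sketch accepts that answer. A task with no clear majority returns None, which a requester might treat as a signal to ask more people.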

Chris Law, founder and vice president of Aggregate Knowledge, a San Francisco company looking to help small online companies exploit the Amazon-style recommendation process “within a day,” is gushing over the concept.

“I think Amazon really got it right,” Law says. “One thing I hate is doing the repetitive stuff…I’ll pay somebody to do it for me. I see [Amazon’s Mechanical Turk] ending up being a reverse eBay: you want to get something done, so you say, ‘Here’s my price.’”

Caption for home-page image: A picture of The Mechanical Turk, an 18th-century device that could supposedly play chess, but which actually had a person hidden inside it.
