The Emerging Science of Worker Productivity

Experiments with Amazon’s Mechanical Turk are teasing apart the factors that determine worker productivity

Emerging Technology from the arXiv

August 19, 2010

There’s a puzzle at the heart of our economy that has troubled economists for decades. The question is this: why do people work hard in environments where they are poorly monitored and paid a fixed wage, rather than a performance-related one? Surely any rational worker would do the bare minimum to get by.

One line of thinking focuses on the relationship between the workers and their employer, which can be influenced by contracts set out in writing and by personal relationships between workers and their managers.

That suggests that one way for an employer to improve productivity would be to perfect its employment contracts.

Another line of thinking is that peer pressure plays an important role. The people around you may affect the way you work. For example, good workers, leading by example, might raise the quality of everybody’s work. On the other hand, bad apples may make the good ones rotten.

But working out which of these effects wins out is hard. Peer pressure is hard to quantify and the various results in this area are somewhat contradictory, suggesting that they may depend on the environment too.

But a new tool is emerging that can help, according to John Horton at Harvard University. He says the recent development of online marketplaces, in which people can buy and sell services over the web, provides a fascinating laboratory in which to test these ideas.

Today he publishes the results of a set of experiments that reveal some of the ways in which peer pressure may influence productivity.

Horton’s laboratory of choice is Amazon’s Mechanical Turk service in which workers sign up to do simple repetitive tasks for a few pennies per pop. Mechanical Turk attracts workers from all over the world and provides employers with a large on-demand workforce that is available 24 hours a day.

The tasks Horton set for his workers (100 of them in each experiment) were to label a picture, for example of a breakfast table, with keywords such as fruit juice, toast and yoghurt, and to evaluate pictures that had already been labelled by other workers.

In the first experiment he showed some workers a picture with many labels while showing others a picture with few labels. He then asked both groups to label another picture which was the same for everyone. Unsurprisingly, the workers shown more labels produced more labels themselves, probably because the example picture set expectations in the workers’ minds of what the employer expected.
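The comparison behind this first experiment can be sketched in a few lines of Python. This is only an illustration of a two-group comparison, not Horton's analysis: the function name `mean_label_gap` and the label counts are made up for the example and do not come from the paper.

```python
from statistics import mean


def mean_label_gap(shown_many, shown_few):
    """Return the difference in average label counts between two groups."""
    return mean(shown_many) - mean(shown_few)


# Hypothetical label counts per worker, invented for illustration only.
shown_many = [6, 5, 7, 4, 6]  # workers first shown a picture with many labels
shown_few = [3, 2, 4, 3, 3]   # workers first shown a picture with few labels

gap = mean_label_gap(shown_many, shown_few)
print(f"Average gap in labels produced: {gap:.1f}")
```

A positive gap would mean that the group shown the heavily labelled example went on to produce more labels on the common picture. A real analysis would also need to check whether the gap is statistically significant.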

In another experiment, workers were shown a picture with few labels, asked to complete an image-labelling exercise and then asked to evaluate the work of another worker, which might have many or few labels. Workers knew that if they didn’t approve a picture, the other worker would not get paid. This provided a measure of their willingness to punish.

What Horton found is interesting. For a start, workers who had previously produced few labels were less likely to punish others. But the effect is complex. Workers seemed willing to punish others who they perceived to have produced too few labels but not those who they perceived to have produced too many labels, where only a few were required. So workers will punish others for low productivity but not for high productivity that is not needed.

That may have important implications for management techniques that ask workers to change their own patterns of work, says Horton. “For example, it may be difficult to get workers to substitute easy, correct procedures for difficult, inefficient procedures,” he says. “Ironically, the difficulty itself might make an outdated procedure harder to replace, as workers who adopt the easier method might be perceived to be shirking.”

In another experiment, Horton began with the labelling exercise, then moved to an evaluation exercise and finally asked workers to complete another labelling exercise. “On average, workers that evaluated highly productive work produced more labels in the follow-on image-labeling task than workers that evaluated less productive work,” says Horton. Workers exposed to images with few labels, by contrast, later produced fewer labels themselves.

That leads to the possibility of a destructive vicious circle in worker behaviour, says Horton. “The finding that exposure to low-output work lowers output, combined with the finding that low-productivity reduces willingness to punish, suggests the possibility of an organizational vicious cycle: after observing idiosyncratically bad work, workers may lower their own output and punish less in response, in turn reducing other workers’ incentives to be highly productive.”

And this, says Horton, may explain why leaders often use the language of contagion to describe morale and why management theory focuses on understanding and influencing culture within an organisation rather than trying to write perfect employment contracts.

Horton’s work raises many questions, not least because it contradicts other work suggesting that it is possible to improve poor workers’ output by pairing them with good workers. By contrast, Horton found that “the bad apples ruined the good apples, and the good apples did nothing for the bad.”

This kind of work fascinates psychologists, economists and managers because it raises the possibility that productivity in the workplace can be manipulated by clever management rather than by expensive financial incentives.

And sure enough, the key result in Horton’s work is that worker productivity is easily pliable. The big question is this: if colleagues affect each other’s work, should this influence be encouraged or discouraged in the workplace?

Horton’s answer is that it depends; but on exactly what, he has yet to nail down. Clearly, there are interesting times ahead for workers on the Mechanical Turk.

Ref: arxiv.org/abs/1008.2437: Employer Expectations, Peer Effects and Productivity: Evidence from a Series of Field Experiments
