It’s telling that the most interesting presentation during MIT Technology Review’s EmTech session on big data last week was not really about big data at all. It was about Amazon’s Mechanical Turk and the experiments it makes possible.
Like many other researchers, sociologist and Microsoft researcher Duncan Watts performs experiments using Mechanical Turk, an online marketplace where users pay others to complete small tasks. Though the platform is used largely to fill gaps in applications where human intelligence is required, social scientists are increasingly turning to it to test their hypotheses.
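For a sense of how cheap and fast such experiments have become, here is a minimal sketch of posting a paid task (a “HIT”) to Mechanical Turk using Amazon’s boto3 Python SDK. The sandbox endpoint, reward, title, and survey URL are illustrative placeholders, not anything Watts described:

```python
import boto3

# Connect to the MTurk requester sandbox (no real money changes hands there).
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# An ExternalQuestion embeds a web page -- here a hypothetical survey URL --
# inside the worker's task frame.
question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/experiment</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
"""

hit = mturk.create_hit(
    Title="Answer a short social-science survey",
    Description="Complete a five-minute questionnaire.",
    Reward="0.25",                    # payment in USD per assignment
    MaxAssignments=100,               # number of distinct workers wanted
    LifetimeInSeconds=86400,          # listing stays up for one day
    AssignmentDurationInSeconds=600,  # each worker gets ten minutes
    Question=question_xml,
)
print("Posted HIT:", hit["HIT"]["HITId"])
```

A hundred responses for a quarter apiece, recruited in hours rather than semesters, is the lowered cost of experimentation Watts was pointing to.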
The point Watts made at EmTech was that the data revolution has less to do with the amount of data available and more to do with the newly lowered cost of running online experiments.
Compare that to Facebook data scientists Eytan Bakshy and Andrew Fiore, who presented right before Watts. Facebook, of course, generates a massive amount of data, and the two spoke of the experiments they perform to inform the design of its products.
But what might have looked like two competing visions for the future of data and hypothesis testing are really two sides of the big data coin. That’s because data on its own isn’t enough. Even the kind of experiment Bakshy and Fiore discussed—essentially an elaborate A/B test—has its limits.
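To make the A/B comparison concrete, the sketch below runs the kind of two-proportion z-test that typically sits under such an experiment. The click counts are made up, and this is a generic illustration, not Facebook’s actual tooling:

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Standard two-proportion z-test: did variant B convert better than A?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)        # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * norm.sf(abs(z))                   # two-sided p-value
    return z, p_value

# Made-up numbers: 1,200 of 50,000 users clicked in variant A,
# 1,310 of 50,000 in variant B.
z, p = two_proportion_ztest(1200, 50000, 1310, 50000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Even a decisive p-value only says that a variant moved the metric, not why it did, which is precisely the limit both sets of speakers were circling.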
This is a point political forecaster and author Nate Silver makes in his recent book The Signal and the Noise, where he criticizes economic forecasters who simply gather as much data as possible and then make inferences without regard for theory.