It’s telling that the most interesting presenter during MIT Technology Review’s EmTech session on big data last week was not really about big data at all. It was about Amazon’s Mechanical Turk, and the experiments it makes possible.
Like many other researchers, sociologist and Microsoft researcher Duncan Watts performs experiments using Mechanical Turk, an online marketplace that allows users to pay others to complete tasks. Used largely to fill in gaps in applications where human intelligence is required, social scientists are increasingly turning to the platform to test their hypotheses.
The point Watts made at EmTech was that, from his perspective, the data revolution has less to do with the amount of data available and more to do with the newly lowered cost of running online experiments.
Compare that to Facebook data scientists Eytan Bakshy and Andrew Fiore, who presented right before Watts. Facebook, of course, generates a massive amount of data, and the two spoke of the experiments they perform to inform the design of its products.
But what might have looked like two competing visions for the future of data and hypothesis testing are really two sides of the big data coin. That’s because data on its own isn’t enough. Even the kind of experiment Bakshy and Fiore discussed—essentially an elaborate A/B test—has its limits.
This is a point political forecaster and author Nate Silver discusses in his recent book The Signal and the Noise. After discussing economic forecasters who simply gather as much data as possible and then make inferences without respect for theory, he writes:
This kind of statement is becoming more common in the age of Big Data. Who needs theory when you have so much information? But this is categorically the wrong attitude to take toward forecasting, especially in a field like economics, where the data is so noisy. Statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes.
Bakshy and Fiore no doubt understand this, as they cited plenty of theory in their presentation. But Silver’s point is an important one. Data on its own won’t spit out answers; theory needs to progress as well. That’s where Watts’s work comes in.
The Internet is transforming how researchers think of the “lab” and enabling new kinds of experiments across the social sciences. Those experiments will be critical in helping us collectively make sense of the huge amounts of data we’re now generating. And those huge data sets will help inform the direction of Watts’s and others’ experiments.
The value of big data isn’t simply in the answers it provides, but rather in the questions it suggests that we ask.
This artist is dominating AI-generated art. And he’s not happy about it.
Greg Rutkowski is a more popular prompt than Picasso.
VR is as good as psychedelics at helping people reach transcendence
On key metrics, a VR experience elicited a response indistinguishable from subjects who took medium doses of LSD or magic mushrooms.
This nanoparticle could be the key to a universal covid vaccine
Ending the covid pandemic might well require a vaccine that protects against any new strains. Researchers may have found a strategy that will work.
How do strong muscles keep your brain healthy?
There’s a robust molecular language being spoken between your muscles and your brain.
Get the latest updates from
MIT Technology Review
Discover special offers, top stories, upcoming events, and more.