China’s masses of data give it an edge in AI—but they may not forever

Karen Hao

March 5, 2019

Last Thursday, MIT hosted a celebration for the new Stephen A. Schwarzman College of Computing, a $1 billion effort to create an interdisciplinary hub of AI research. During an onstage conversation between Schwarzman, the CEO and cofounder of investment firm Blackstone, and the Institute’s president, Rafael Reif, Schwarzman noted, as he has before, that his leading motivation for donating the first $350 million to the college was to give the US a competitive boost in the face of China’s coordinated national AI strategy.

That prompted a series of questions about the technological race between the countries. They essentially boiled down to this: When it comes to AI, more data is better, because it is a brute-force situation. How can the US outcompete China when the latter has far more people and the former cares more about data privacy? Is it, in other words, just a lost cause for the US to try to “win”?

Here was Reif’s response: “That is the state of the art today—that you need tons of data to teach a machine.” He added, “State of the art changes with research.”

Reif’s comments served as an important reminder about the nature of AI: throughout its history, the state of the art has evolved quickly. We could very well be one breakthrough away from a day when the technology looks nothing like what it does now. In other words, data may not always be king. (See “We analyzed 16,625 papers to figure out where AI is headed next.”)

Indeed, within the last few years, several researchers have begun to pursue new techniques that require very little data. Josh Tenenbaum, a professor of brain and cognitive sciences at MIT, for instance, has been developing probabilistic learning models, inspired by the way children quickly generalize their knowledge from exposure to just a few examples.

Reif went on to explain his vision. “Studying how the brain learns, we created the state of the art today,” he said. “We can use that state of the art now to [further] learn how the brain learns.” Given that our brains themselves do not require a lot of data to learn, the better we come to understand their processes, the more closely we will be able to mimic them in new types of algorithms.

This story originally appeared in our AI newsletter The Algorithm.
