Three Questions with the Man Leading Baidu’s New AI Effort

The Chinese Web giant Baidu is researching ways to build artificial neural networks that learn without guidance from humans. We spoke to the man leading that effort.

Artificial intelligence is guided by the far-off goal of having software match humans at important tasks. After seeing results from a new field called deep learning, which involves processing large quantities of data using simulated networks of millions of interconnected neurons, some experts have come to believe that this goal isn’t so distant after all (see “Deep Learning” and “Facebook Creates Software That Matches Faces Almost as Well as You Do”).

Last week, Baidu, China’s largest Web search company, joined U.S. tech giants betting big on deep learning by opening a new Silicon Valley lab dedicated to the approach (see “Chinese Search Giant Baidu Hires Man Behind the ‘Google Brain’”). Adam Coates, who leads research at the new lab, spoke with MIT Technology Review’s Tom Simonite about how deep learning might bring software closer to human performance at some tasks.

The “Google Brain” experiment where a large neural network learned to recognize cats and other objects just by looking at photos from YouTube is often held up as a key proof of the power of deep learning (see “Self-Taught Software”). What makes that project so important?

The thing that’s neat about the Google result is that no one has to tell it what an object is. We have so much evidence from neuroscience that this is a crucial way to learn about how the world works. But it’s also an engineering imperative. I cannot program enough rules into the computer for it to understand the world; now we can try to have machines learn the rules themselves.

Google’s system fell short of human performance, at best detecting human faces only 81 percent of the time. The more established approach of “supervised” learning, where software is given hand-labeled data to learn from, can do better. Do we know how to get unsupervised, or self-taught, systems to improve?

How to make it pay off at the level we want—[achieving] human-level performance—is very challenging.

If you give me a lot of examples of what you want to predict, then I can train software to get that right. The challenge is how to succeed when you don’t have a lot of examples. Human beings do not have to see a million cats to understand what one is. We might use a combination of supervised and unsupervised learning. Understanding how to blend those two ideas together is going to be crucial.
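One way to picture the blend Coates describes: learn a representation from unlabeled data first, then train a classifier on only a handful of labeled examples. The sketch below is an illustrative assumption on our part, not a description of Baidu’s or Google’s systems; PCA stands in for the unsupervised stage and logistic regression for the supervised one, using scikit-learn’s bundled digits dataset.

```python
# Minimal sketch: unsupervised feature learning + supervised learning on few labels.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)  # 1,797 small images of handwritten digits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0)

# Unsupervised stage: learn a feature representation from all training
# images, ignoring their labels entirely.
pca = PCA(n_components=32).fit(X_train)

# Supervised stage: train a classifier on just 100 labeled examples,
# but in the learned feature space rather than on raw pixels.
few_X, few_y = X_train[:100], y_train[:100]
clf = LogisticRegression(max_iter=1000).fit(pca.transform(few_X), few_y)

print("accuracy with 100 labels:", clf.score(pca.transform(X_test), y_test))
```

The division of labor is the point: if the unsupervised stage does most of the statistical work, the supervised stage needs far fewer labeled examples.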

At Stanford—inspired by the Google Brain experiment—you developed an even larger neural network. Will bigger “brains” automatically be smarter?

For the scale of the challenge we’re looking at—human-level performance—it’s very clear that for a small neural network there’s no hope. State-of-the-art ones have hundreds of millions of connections. You can do a lot with that: recognize a lot of objects, for example.

[But] it does not seem to be as simple as just making the neural network much bigger. The Google Brain result was built on a huge distributed system with a lot of CPU cores [16,000]. We found that if you put a lot of GPUs [specialized graphics processors] together, we could make a much bigger neural network—10 billion connections—with 16 machines instead of 1,000.

We used that same benchmark [images from YouTube videos] that the Google team did. But even though we could train a much larger neural net, we didn’t necessarily get a better cat detector. Right now we can run neural networks that are larger than we know what to do with.

[At the Baidu lab] we want to build a framework to run experiments big enough to test out all the variations in algorithms that might universally improve performance.
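For a sense of the scales in this exchange, a fully connected layer between two groups of units contributes (inputs × outputs) connections, so width drives the totals very quickly. The layer widths below are invented round numbers chosen only to reproduce the magnitudes Coates cites; they are not the real Google or Stanford topologies.

```python
# Back-of-the-envelope connection counts for fully connected layers.
def connections(layer_sizes):
    """Total connections in a stack of fully connected layers."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical round-number widths, for scale only.
modest = [10_000, 10_000, 10_000]          # two 10k x 10k layers
huge = [100_000, 50_000, 50_000, 20_000]   # a much wider stack

print(f"modest net: {connections(modest):,} connections")  # 200,000,000
print(f"huge net:   {connections(huge):,} connections")    # 8,500,000,000
```

At the second scale, simply storing the weights as 32-bit floats takes tens of gigabytes, which is one reason training moved from single machines to clusters of CPUs and then to smaller clusters of GPUs.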
