Playtime’s Over

Getting computers to beat humans at games is impressive. But now the real work begins.

Emma Brunskill

February 22, 2017

Early last year, a computer achieved world-class performance in the game Go—years before most people believed such a feat would be possible.

That’s impressive, but our ambitions should be set higher. Computer science could help provide what the world critically needs: tools that enable all of us to reach beyond what we thought we were capable of. Reinforcement learning—an integral part of the Go success—can accelerate that process (see “10 Breakthrough Technologies: Reinforcement Learning”).

Reinforcement learning is a way of making a computer learn through experience to make a series of decisions that yield positive outcomes—even without any prior knowledge of how its actions will affect its immediate environment. A software-based tutor, for example, would alter its activities in response to how students perform on tests after using it.
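To make the tutor example concrete, here is a minimal sketch of learning from experience. It uses an epsilon-greedy multi-armed bandit, which is a simplified one-decision case of reinforcement learning rather than the full sequential setting. The activity names, the simulated pass rates, and the `run_tutor` function are all hypothetical and chosen only for illustration.

```python
import random


def run_tutor(true_pass_rates, rounds=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy choice among teaching activities.

    true_pass_rates is a simulated environment (an assumption for this
    sketch): the chance that a student passes a test after each activity.
    The tutor never reads these numbers; it only observes pass or fail.
    """
    rng = random.Random(seed)
    activities = list(true_pass_rates)
    counts = {a: 0 for a in activities}
    estimates = {a: 0.0 for a in activities}

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: occasionally try an activity at random.
            activity = rng.choice(activities)
        else:
            # Exploit: use the activity that currently looks best.
            activity = max(activities, key=lambda a: estimates[a])

        # Simulated test result for one student (1.0 = passed).
        reward = 1.0 if rng.random() < true_pass_rates[activity] else 0.0

        # Update the running average pass rate for that activity.
        counts[activity] += 1
        estimates[activity] += (reward - estimates[activity]) / counts[activity]

    return estimates, counts


estimates, counts = run_tutor({"practice_problem": 0.7, "explanatory_video": 0.5})
best = max(estimates, key=estimates.get)
print(best, estimates, counts)
```

The tutor starts with no knowledge of which activity works and shifts toward the one that produces better test results, while still exploring now and then in case its early impressions were wrong.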

If we hope to create artificial teaching agents using reinforcement learning, we’ll need algorithms that are “data smart.” We might gather data from online educational systems and use it to help the agent estimate the effectiveness of different teaching approaches. When a student logs in, should the system provide him with a problem to solve? Or would starting with an explanatory video be better? The data can help it decide.
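One way to use logged data like this is sketched below, under stated assumptions. It applies inverse propensity scoring, a standard technique for estimating how a new teaching policy would have performed from data collected by an older one. The log format, the logging probabilities, and the numbers are invented for illustration; this is not a description of any particular system.

```python
# Hypothetical log: (first activity shown, probability the old system
# had of showing it, whether the student later passed a quiz).
logged = (
    [("problem", 0.8, 1)] * 6
    + [("problem", 0.8, 0)] * 2
    + [("video", 0.2, 1)]
    + [("video", 0.2, 0)]
)


def ips_value(log, target_policy):
    """Inverse-propensity estimate of a policy's average pass rate.

    target_policy maps each activity to the probability that the new
    policy would choose it for a student who has just logged in.
    """
    total = 0.0
    for activity, logging_prob, passed in log:
        # Reweight each record by how much more (or less) likely the
        # new policy is to pick this activity than the old system was.
        weight = target_policy[activity] / logging_prob
        total += weight * passed
    return total / len(log)


always_problem = {"problem": 1.0, "video": 0.0}
always_video = {"problem": 0.0, "video": 1.0}

problem_value = ips_value(logged, always_problem)
video_value = ips_value(logged, always_video)
print(problem_value, video_value)
```

On this made-up log, starting with a problem scores about 0.75 and starting with a video about 0.5. With only two logged video sessions, the second estimate is very uncertain, which is the kind of data shortage that makes these decisions hard.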

But in some cases there’s not enough data, or not the right kind of data, which makes it challenging to develop systems that make good decisions. It would be nice if we could create a system that didn’t need so much data in the first place. That’s exactly what my group is working on: we’re developing reinforcement-learning algorithms and statistical techniques that allow computers to make good suggestions while using less data. We still have a lot of work to do, but we’re narrowing the gap between theory and practice.

In the end, we shouldn’t leave it all to the computers. So-called “human-in-the-loop” reinforcement learning can accelerate the process, allowing algorithms to “reason” about their own limited performance and reach out to humans for help when they need, for example, to expand the set of possible decisions. My group and our collaborators at the University of Washington are now testing algorithms for a tutoring system that can tell if its current curriculum isn’t enabling all students to learn well, and can then ask people to add new hints to the system. Such human-computer collaborations could help students learn using approaches we can’t yet imagine. This vision of reinforcement learning has artificially intelligent agents redefining what outstanding human performance looks like, and enabling all of us to achieve it.
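The idea of a system noticing its own limits can be sketched very simply. The function below is a hypothetical illustration, not the algorithm under test: it flags that people should be asked for new hints once every existing hint has been tried enough times and none reaches a target success rate. The name `needs_new_hints`, the thresholds, and the data layout are all assumptions.

```python
def needs_new_hints(outcomes_by_hint, target_rate=0.6, min_trials=30):
    """Return True when every existing hint has been tried enough
    and none reaches the target success rate.

    outcomes_by_hint maps a hint name to a list of 0/1 student outcomes.
    """
    for outcomes in outcomes_by_hint.values():
        if len(outcomes) < min_trials:
            return False  # still too little data to judge this hint
        if sum(outcomes) / len(outcomes) >= target_rate:
            return False  # at least one hint works well enough
    return True


# Two hints, each tried 40 times, with success rates of 25% and 50%.
struggling = {
    "hint_a": [1] * 10 + [0] * 30,
    "hint_b": [1] * 20 + [0] * 20,
}
# Adding a hint that succeeds 75% of the time removes the need for help.
improved = dict(struggling, hint_c=[1] * 30 + [0] * 10)

print(needs_new_hints(struggling), needs_new_hints(improved))
```

A real system would need a more careful statistical test than a fixed threshold, but the shape is the same: the algorithm decides when its current set of options is the bottleneck and hands that problem to a person.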

Emma Brunskill is an assistant professor of computer science at Stanford University.
