Elon Musk’s OpenAI Unveils a Simpler Way for Machines to Learn

The group says it has a more practical way to get software to learn tasks, such as steering robots, that require multiple actions.

In 2013 a British artificial-intelligence startup called DeepMind surprised computer scientists by showing off software that could learn to play classic Atari games better than an expert human player. DeepMind was soon acquired by Google, and the technique that beat the Atari games, reinforcement learning, has become a hot topic in the field of AI and robotics. Google used reinforcement learning to create software that beat a champion Go player last year.

Now OpenAI, a nonprofit research institute cofounded and funded by Elon Musk, says it has discovered that an easier-to-use alternative to reinforcement learning can achieve comparable results when it plays games and performs other tasks. At MIT Technology Review’s EmTech Digital conference in San Francisco on Monday, OpenAI’s research director, Ilya Sutskever, said the approach could allow researchers to make faster progress in machine learning.

“It’s competitive with today’s reinforcement-learning algorithms on standard benchmarks,” said Sutskever. “It is surprising that something so simple actually works.” 

Machine-learning software from OpenAI figured out how to play classic Atari games.

Sutskever argues that finding new ways to have software learn to do things like play computer games or steer robots is important to making machine-learning software take on more complex tasks than just recognizing images or transcribing our speech. “If we have computer systems learn to take complicated actions in the world, then I think we would be comfortable calling them intelligent,” he said.

Sutskever and colleagues tested their approach, called evolution strategies, by building software that learned to play more than 50 Atari games, including Pong and Centipede. Because the new method is easier to scale up across multiple processors, in one hour they could train artificial players comparable to those that took a day to produce using a reinforcement-learning system published by Google DeepMind last year. The resulting players showed the same ability to learn things like the need to surface for air in the game Seaquest (middle frame in the animation).

OpenAI’s research director, Ilya Sutskever

Evolution strategies showed a similar advantage when used to take on a standard test from robotics in which software has to figure out how to make a humanoid walk in a simulated environment. It took 10 minutes to achieve results that a state-of-the-art reinforcement-learning system would need about 10 hours to attain, the researchers say.

The technique is a reboot of a decades-old idea about how to get learning software to try out different actions and identify the most effective ones. It is loosely inspired by how natural selection causes biological organisms to adapt to their environments.
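
For readers who want a feel for the idea, here is a minimal sketch of one common form of evolution strategy, written in plain Python. It is not OpenAI's code, and the article does not spell out the details of the group's implementation. The function names, the toy reward, the reward normalization, and every setting (population size, noise scale, step size) are assumptions chosen for illustration. The loop makes many randomly perturbed copies of the current parameters, scores each one, and shifts the parameters toward the perturbations that scored above average.

```python
import random

# Hypothetical toy task: the "reward" is higher the closer the
# parameters get to a hidden target. A real use would instead run
# a game or a simulated robot and return its score.
TARGET = [0.5, -0.3, 1.0]


def toy_reward(params):
    """Negative squared distance to TARGET (higher is better)."""
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


def evolution_strategy(reward, theta, iterations=200, population=50,
                       sigma=0.1, step_size=0.01, seed=0):
    """Basic evolution-strategy loop; all defaults are illustrative."""
    rng = random.Random(seed)
    theta = list(theta)
    n = len(theta)
    for _ in range(iterations):
        # 1. Create a population of random perturbations of theta.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                  for _ in range(population)]
        # 2. Score every perturbed candidate. Each evaluation is
        #    independent of the others, so this step could be spread
        #    across many processors.
        rewards = [reward([t + sigma * e for t, e in zip(theta, eps)])
                   for eps in noises]
        # 3. Normalize the scores so above-average candidates get a
        #    positive weight and below-average ones a negative weight.
        mean = sum(rewards) / population
        std = (sum((r - mean) ** 2 for r in rewards) / population) ** 0.5
        if std == 0.0:
            continue  # every candidate scored the same; nothing to learn
        weights = [(r - mean) / std for r in rewards]
        # 4. Move theta toward the perturbations that scored well.
        for i in range(n):
            direction = sum(w * eps[i] for w, eps in zip(weights, noises))
            theta[i] += step_size * direction / (population * sigma)
    return theta
```

Calling `evolution_strategy(toy_reward, [0.0, 0.0, 0.0])` should return parameters close to `TARGET`. Nothing in the loop computes a gradient of the reward directly; it only needs a score for each candidate, which is part of why the approach is considered simple to use.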

“An algorithm everybody has known about for a long time works better than most people thought,” said Sutskever.

He declined to suggest specific applications of AI that might get a boost from the evolution strategies technique, saying more research is needed on its strengths and limitations. But Sutskever said that comparing the method with reinforcement learning suggested it would be better at learning to perform more complex tasks that require more steps to get a result.

For that reason, Sutskever said, he believes evolution strategies will help OpenAI’s goal of creating what he calls artificial general intelligence—software that can adapt to many kinds of complex scenarios.

Most researchers in machine learning don’t talk much about general intelligence, instead pursuing progress on specific, often narrowly focused problems. OpenAI’s mission statement includes a commitment to creating artificial general intelligence. Sutskever said the pace of progress in machine learning means that goal is worth thinking about now.

“[It] seems far off right now but [was] way more far off five years ago,” he said. “The number of people and the amount of effort going into developing these algorithms is extremely high—things are moving forward at a very healthy pace.”
