OpenAI’s Goofy Sumo-Wrestling Bots Are Smarter Than They Look

October 12, 2017

It could be a virtual blood sport in some absurdist techno-future.

OpenAI, a research institute backed by Elon Musk and several other Silicon Valley big shots, has revealed its latest research on developing more powerful forms of machine learning. And it’s demonstrating the technology using virtual sumo wrestling.

The virtual wrestlers might look slightly ridiculous, but they are using a very clever approach to learning in a fast-changing environment while dealing with an opponent.

The agents use a form of reinforcement learning, a technique inspired by the way animals learn through feedback. It has proved useful for training computers to play games and to control robots (see “10 Breakthrough Technologies 2017: Reinforcement Learning”).
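To make "learning through feedback" concrete, here is a minimal sketch in Python. It is not OpenAI's code and has nothing to do with the sumo environment itself. It is a toy two-choice problem, and the function name `train_bandit`, the fixed payoffs, and the exploration rate are all assumptions chosen for illustration. The agent tries actions, observes a reward, and nudges its estimate of each action's value toward what it saw.

```python
import random


def train_bandit(payoffs, steps=1000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (hypothetical example, not OpenAI's method).

    payoffs: fixed reward returned by each action (an assumption; real
             environments are noisy and change over time).
    epsilon: fraction of steps spent exploring a random action.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(payoffs)  # the agent's current guess per action
    counts = [0] * len(payoffs)       # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(payoffs))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(payoffs)), key=lambda a: estimates[a])

        reward = payoffs[action]  # feedback from the environment
        counts[action] += 1
        # Move the estimate toward the observed reward (running average).
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = train_bandit([0.2, 1.0])
    print("value estimates:", estimates)
    print("times each action was tried:", counts)
```

The agent starts with no knowledge and initially settles for the weaker action, but occasional random exploration lets it discover the better one, after which it chooses that action most of the time. Systems like the sumo wrestlers replace the two-entry table with a neural network and a far richer set of actions.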

One big challenge with reinforcement learning is that it doesn’t work so well in more realistic situations, where things are constantly in flux. OpenAI has already developed its own reinforcement-learning algorithm, called proximal policy optimization (PPO), which is especially well suited to changing environments.

The latest work, done in collaboration with researchers from Carnegie Mellon University and UC Berkeley, demonstrates a way for AI agents to apply what the researchers call a “meta-learning” framework. This means the agents can take what they have already learned and apply it to a new situation.

Inside the RoboSumo environment (see video above), the agents started out behaving randomly. Through thousands of iterations of trial and error, they gradually developed the ability to move and, eventually, to fight. Through further iterations, the wrestlers developed the ability to avoid each other, and even to question their own actions. This learning happened on the fly, with the agents adapting even as they wrestled each other.

Flexible learning is a very important part of human intelligence, and it will be crucial if machines are going to become capable of performing anything other than very narrow tasks in the real world. This kind of learning is very difficult to implement in machines, and the latest work is a small but significant step in that direction.

The researchers found that by using meta-learning, their sumo-bots could learn effective strategies more quickly. So even if they look a bit hapless, don’t underestimate them.
