MIT Technology Review

There’s a new way to have robots learn from their mistakes

By thinking of every incorrect action in one task as a way to do part of a different one, we can give AI the gift of hindsight.

Background: When humans mess up, they can learn several things: that an approach to a task didn’t work, but also that the method they just tried might be helpful for some other job. But when robots try to master tasks by themselves, they typically only learn by getting a reward for every step of a job they do correctly.

Useful mistakes: IEEE Spectrum reports that OpenAI, a nonprofit research company, has released free software called Hindsight Experience Replay (HER) that lets an AI’s “failures” become successes. It does that by looking at how every attempt at one task could be applied to others. (The software also includes virtual environments where AIs can practice things like picking up objects or holding a pen.)
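
To make the idea concrete, here is a minimal sketch of hindsight relabeling in Python. It is not OpenAI's code: the `Transition` record, the helpers `run_episode` and `relabel_with_hindsight`, the one-dimensional "positions," and the 1.0/0.0 reward values are all assumptions chosen for illustration. A failed attempt at one goal is copied and stored again as if the spot the robot actually reached had been the goal all along.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    """One step of experience (hypothetical layout, for illustration only)."""
    state: float       # position before the action
    action: float      # how far the robot moved
    next_state: float  # position after the action
    goal: float        # position the robot was asked to reach
    reward: float      # 1.0 if next_state matches the goal, else 0.0


def sparse_reward(achieved: float, goal: float, tol: float = 1e-6) -> float:
    """All-or-nothing reward: paid only when the goal is actually reached."""
    return 1.0 if abs(achieved - goal) <= tol else 0.0


def run_episode(start: float, actions: List[float], goal: float) -> List[Transition]:
    """Apply each action in turn and record what happened."""
    episode, state = [], start
    for action in actions:
        next_state = state + action
        reward = sparse_reward(next_state, goal)
        episode.append(Transition(state, action, next_state, goal, reward))
        state = next_state
    return episode


def relabel_with_hindsight(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, pretending the final position was the goal."""
    if not episode:
        return []
    achieved = episode[-1].next_state
    return [
        t._replace(goal=achieved, reward=sparse_reward(t.next_state, achieved))
        for t in episode
    ]
```

Suppose the robot aims for position 5 but only gets from 0 to 2 in two steps. The original episode earns no reward at all, but the relabeled copy carries a goal of 2, and its last step earns the reward. A learner trained on both copies gets something useful out of the miss.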

More realistic robo-training: HER doesn’t give robots rewards for getting a step of a task right—it only hands them out if the entire thing is done properly. That’s closer to how robots will learn in real life, but it usually slows training right down. Still, because every failed attempt can also get used for another job, that’s less of a problem in OpenAI’s system.
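
The gap between the two reward schemes can be sketched in a few lines of Python. Both functions below are hypothetical illustrations rather than anything from OpenAI's release, and the one-dimensional positions and 1.0/0.0 values are assumptions. A per-step reward pays out for every move that gets closer to the goal, while an all-or-nothing reward pays only when the task is actually finished.

```python
from typing import List


def per_step_rewards(positions: List[float], goal: float) -> List[float]:
    """Pay 1.0 for each move that ends closer to the goal than it started."""
    rewards = []
    for before, after in zip(positions, positions[1:]):
        closer = abs(goal - after) < abs(goal - before)
        rewards.append(1.0 if closer else 0.0)
    return rewards


def all_or_nothing_rewards(
    positions: List[float], goal: float, tol: float = 1e-6
) -> List[float]:
    """Pay 1.0 only for a move that lands on the goal itself."""
    return [1.0 if abs(goal - after) <= tol else 0.0 for after in positions[1:]]
```

For a robot that moves from 0 to 1 to 2 while aiming for 5, the per-step version rewards both moves, while the all-or-nothing version returns nothing. That silence is why training on this kind of reward alone tends to be slow, and why reusing failed attempts for other goals helps.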
