By thinking of every incorrect action in one task as a way to do part of a different one, we can give AI the gift of hindsight.
Background: When humans mess up, they can learn several things: that an approach to a task didn’t work, but also that the method they just tried might be helpful for some other job. But when robots try to master tasks by themselves, they typically only learn by getting a reward for every step of a job they do correctly.
Useful mistakes: IEEE Spectrum reports that OpenAI, a nonprofit research company, has released free software called Hindsight Experience Replay (HER) that lets an AI’s “failures” become successes. It does that by looking at how every attempt at one task could be applied to others. (The software also includes virtual environments where AIs can practice things like picking up objects or holding a pen.)
More realistic robo-training: HER doesn’t give robots rewards for getting a step of a task right; it only hands them out if the entire thing is done properly. That’s closer to how robots will learn in real life, but such all-or-nothing rewards usually slow training right down. Still, because every failed attempt can also get used for another job, that’s less of a problem in OpenAI’s system.
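How a failure becomes a success: The trick described above can be sketched in a few lines of Python. This is a minimal illustration, not OpenAI's released code. The grid world, the names `Transition`, `sparse_reward`, `rollout`, and `relabel_with_final_state`, and the reward values (1 for reaching the goal, 0 otherwise) are all assumptions made for the example. It uses one simple relabeling choice: pretend that wherever the agent ended up was the goal all along.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

State = Tuple[int, int]  # hypothetical (x, y) position on a grid

# Hypothetical action set for the toy grid world.
MOVES: Dict[str, State] = {"right": (1, 0), "left": (-1, 0), "up": (0, 1), "down": (0, -1)}


@dataclass(frozen=True)
class Transition:
    state: State
    action: str
    next_state: State
    goal: State
    reward: float


def sparse_reward(achieved: State, goal: State) -> float:
    """All-or-nothing reward (assumed values): 1 only if the goal is reached."""
    return 1.0 if achieved == goal else 0.0


def rollout(start: State, actions: List[str], goal: State) -> List[Transition]:
    """Record one episode of the agent following a fixed list of actions."""
    episode, state = [], start
    for action in actions:
        dx, dy = MOVES[action]
        next_state = (state[0] + dx, state[1] + dy)
        episode.append(Transition(state, action, next_state, goal,
                                  sparse_reward(next_state, goal)))
        state = next_state
    return episode


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Hindsight step: treat the state the agent actually reached as the goal."""
    if not episode:
        return []
    achieved_goal = episode[-1].next_state
    return [
        Transition(t.state, t.action, t.next_state, achieved_goal,
                   sparse_reward(t.next_state, achieved_goal))
        for t in episode
    ]


# The agent was asked to reach (3, 0) but wandered to (2, 1) instead.
failed_episode = rollout(start=(0, 0), actions=["right", "right", "up"], goal=(3, 0))
relabeled = relabel_with_final_state(failed_episode)

# Both versions would be stored for training: the original teaches nothing
# about (3, 0), but the relabeled copy is a successful example of reaching (2, 1).
replay_buffer = failed_episode + relabeled
```

In the original episode every reward is zero, so there is nothing to learn from. In the relabeled copy the last step earns a reward, which gives the learning algorithm a useful signal even though the attempt missed its intended target.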