DeepMind has developed a vast candy-colored virtual playground that teaches AIs general skills by endlessly changing the tasks it sets them. Instead of developing just the skills needed to solve a particular task, the AIs learn to experiment and explore, picking up skills they then use to succeed in tasks they’ve never seen before. It is a small step toward general intelligence.
What is it? XLand is a video-game-like 3D world that the AI players sense in color. The playground is managed by a central AI that sets the players billions of different tasks by changing the environment, the game rules, and the number of players. Both the players and the playground manager use reinforcement learning to improve by trial and error.
During training, the players first face simple one-player games, such as finding a purple cube or placing a yellow ball on a red floor. They advance to more complex multiplayer games like hide and seek or capture the flag, where teams compete to be the first to find and grab their opponent’s flag. The playground manager has no specific goal but aims to improve the general capability of the players over time.
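To make the idea of a task concrete, here is a minimal Python sketch of a playground that builds tasks by varying the world, the game, and the number of players, and that offers harder multiplayer games more often as the players improve. It is a toy illustration, not DeepMind's method: the world names, the `skill` score, and the sampling rule are all assumptions made up for this example, and the real playground manager is itself trained with reinforcement learning.

```python
import random
from dataclasses import dataclass

# Hypothetical names for illustration only; the goals echo the article's examples.
WORLDS = ["flat_arena", "ramps", "multi_level"]
SIMPLE_GOALS = ["find a purple cube", "place a yellow ball on a red floor"]
MULTIPLAYER_GOALS = ["hide and seek", "capture the flag"]


@dataclass(frozen=True)
class Task:
    world: str
    goal: str
    num_players: int


def sample_task(rng: random.Random, skill: float) -> Task:
    """Sample one task; `skill` in [0, 1] is an assumed estimate of player ability.

    The higher the skill, the more likely the task is a multiplayer game.
    """
    if rng.random() < skill:
        goal = rng.choice(MULTIPLAYER_GOALS)
        num_players = rng.randint(2, 4)
    else:
        goal = rng.choice(SIMPLE_GOALS)
        num_players = 1
    return Task(world=rng.choice(WORLDS), goal=goal, num_players=num_players)


if __name__ == "__main__":
    rng = random.Random(0)
    # Early in training (low skill) most tasks are simple one-player games.
    for skill in (0.0, 0.5, 1.0):
        print(skill, sample_task(rng, skill))
```

Even this toy version shows why the number of possible tasks grows so quickly: every new world, goal, or player count multiplies the combinations available.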
Why is this cool? AIs like DeepMind’s AlphaZero have beaten the world’s best human players at chess and Go. But they can only learn one game at a time. As DeepMind cofounder Shane Legg put it when I spoke to him last year, it’s like having to swap out your chess brain for your Go brain each time you want to switch games.
Researchers are now trying to build AIs that can learn multiple tasks at once, which means teaching them general skills that make it easier to adapt.
One exciting trend in this direction is open-ended learning, where AIs are trained on many different tasks without a specific goal. In many ways, this is how humans and other animals seem to learn, via aimless play. But this requires a vast amount of data. XLand generates that data automatically, in the form of an endless stream of challenges. It is similar to POET, an AI training dojo where two-legged bots learn to navigate obstacles in a 2D landscape. XLand’s world is much more complex and detailed, however.
XLand is also an example of AI learning to make itself, or what Jeff Clune, who helped develop POET and leads a team working on this topic at OpenAI, calls AI-generating algorithms (AI-GAs). “This work pushes the frontiers of AI-GAs,” says Clune. “It is very exciting to see.”
What did they learn? Some of DeepMind’s XLand AIs played 700,000 different games in 4,000 different worlds, encountering 3.4 million unique tasks in total. Instead of learning the best thing to do in each situation, which is what most existing reinforcement-learning AIs do, the players learned to experiment until they beat the task in front of them: moving objects around to see what happened, for example, or using one object as a tool to reach another or as cover to hide behind.
In the videos you can see the AIs chucking objects around until they stumble on something useful: a large tile, for example, becomes a ramp up to a platform. It is hard to know for sure if all such outcomes are intentional or happy accidents, say the researchers. But they happen consistently.
AIs that learned to experiment had an advantage in most tasks, even ones that they had not seen before. The researchers found that the XLand AIs adapted quickly to a complex new task after just 30 minutes of training on it. But AIs that had not spent time in XLand could not learn these tasks at all.