Facebook helped create an AI scavenger hunt that could lead to the first useful home robots

To make AI programs smarter, researchers are creating virtual worlds for them to explore.

Will Knight

May 2, 2018

Creative Mania, concheese from the Noun Project | Erin Winick

Artificial-intelligence programs could develop some much-needed common sense by competing in scavenger hunts inside virtual homes filled with simulated coffee tables, couches, lamps, and other everyday things.

Researchers at Facebook and Georgia Tech developed the scavenger-hunt challenge. The contest requires a virtual agent to look for something in a simulated home after parsing a natural-language question. An agent is placed in a room of a virtual home at random and asked something like “What color is the car?” or “Where is the coffee table?” Finding the answer requires an agent to understand the question and then explore the virtual space in search of the relevant object.

“The goal is to build intelligent systems that can see, talk, plan, and reason,” says Devi Parikh, a computer scientist at Georgia Tech and Facebook AI Research (FAIR), who developed the contest with her colleague and husband, Dhruv Batra.

Parikh, Batra, and their collaborators developed an agent that combines several different forms of machine learning to answer questions about a home. The agent also learns a rudimentary form of common sense by figuring out, through lots of trial and error, the best places to look for a particular object. For instance, over time, the agent learns that cars are usually found in the garage, and it understands that garages can usually be found by going out the front or back door.
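The trial-and-error idea can be illustrated with a toy sketch. This is not the researchers' agent, which works from visual input in 3-D homes; it is a minimal epsilon-greedy value estimate, and the rooms, objects, probabilities, and function names below are all hypothetical, chosen only to show how repeated searching can teach a program where objects tend to be.

```python
import random

# Hypothetical toy world: the chance that each object is found in each room.
# These numbers are illustrative assumptions, not data from the research.
OBJECT_LOCATIONS = {
    "car": {"garage": 0.9, "kitchen": 0.0, "living room": 0.1},
    "coffee table": {"garage": 0.05, "kitchen": 0.1, "living room": 0.8},
}
ROOMS = ["garage", "kitchen", "living room"]


def search(obj, room, rng):
    """Return a reward of 1.0 if the object turns up in the room, else 0.0."""
    return 1.0 if rng.random() < OBJECT_LOCATIONS[obj][room] else 0.0


def learn_where_to_look(episodes=5000, epsilon=0.1, seed=0):
    """Estimate, by trial and error, how rewarding each room is per object."""
    rng = random.Random(seed)
    value = {obj: {room: 0.0 for room in ROOMS} for obj in OBJECT_LOCATIONS}
    counts = {obj: {room: 0 for room in ROOMS} for obj in OBJECT_LOCATIONS}
    for _ in range(episodes):
        obj = rng.choice(list(OBJECT_LOCATIONS))
        if rng.random() < epsilon:
            # Explore: occasionally try a random room.
            room = rng.choice(ROOMS)
        else:
            # Exploit: go to the room that has paid off best so far.
            room = max(ROOMS, key=lambda r: value[obj][r])
        reward = search(obj, room, rng)
        counts[obj][room] += 1
        # Incremental running average of the rewards seen for this pair.
        value[obj][room] += (reward - value[obj][room]) / counts[obj][room]
    return value


def best_room(value, obj):
    """Return the room with the highest learned value for the object."""
    return max(ROOMS, key=lambda r: value[obj][r])


if __name__ == "__main__":
    learned = learn_where_to_look()
    for obj in OBJECT_LOCATIONS:
        print(obj, "->", best_room(learned, obj))
```

Under these assumed probabilities, the learned values end up favoring the garage for the car and the living room for the coffee table, mirroring the kind of regularity the article describes.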

The approach relies on reinforcement learning, a form of machine learning inspired by animal behavior, as well as imitation learning, a technique that lets algorithms learn by observation. The virtual homes were created by researchers at FAIR and UC Berkeley. The research was highlighted during Facebook’s annual developer conference today.

An agent navigating a virtual home.

Embodied QA

A growing number of researchers are experimenting with virtual environments for training AI programs. The approach is seen as a way to broaden the intelligence of AI and overcome fundamental limitations. While there has been remarkable progress in AI lately, it has tended to involve computers doing a single task, like recognizing faces in images or playing a board game. What’s more, AI programs are generally trained on still images rather than in 3-D settings.

As early AI research showed, it simply isn’t practical to hand-code common-sense knowledge into a system (see “AI’s language problem”). So the solution will most likely be for AI programs to learn such knowledge for themselves.

Microsoft has released an environment called Malmo, which is based on the game Minecraft. Researchers at the Allen Institute for AI (Ai2) in Seattle developed another 3-D virtual environment for training AI agents. This environment also reflects basic physics, and it allows agents to take simple actions. The Ai2 researchers have proposed a similar set of natural-language challenges for agents in their environment.

Roozbeh Mottaghi, the lead researcher behind the Ai2 project, says it is crucial for these virtual environments to become more realistic if we want AI agents to learn properly inside them. Currently, this isn’t really practical. “Designing a single realistic-looking room might take months, and it is costly,” he says. “And defining realistic physical properties for every object is very challenging.”

In the near term the work could help make chatbots and personal assistants less stubbornly dumb. Progress on more open-ended tasks, such as understanding natural language, has been slower. A machine can be taught to repeat patterns in text, but coping with the ambiguity of language usually requires some common-sense knowledge of the real world. The common sense developed by exploring virtual environments could help chatbots and personal assistants converse without making so many errors.

Facebook knows this challenge firsthand. The company launched a general-purpose virtual assistant, called M, in 2015. But it relied on humans to take over when the underlying software failed to understand a command or query. The product never really took off, and it was discontinued last year.

The research may also feed into more futuristic projects. Imagine asking a Roomba to go vacuum the bedroom. Even if the machine could understand your voice and see its surroundings, it would have no idea what a bedroom is, or where one might be found. But future home robots might use AI software that has learned such simple facts about ordinary homes by exploring lots of virtual homes first.

“We are clearly headed into an age of assistive agents,” says Batra. Referring to Amazon’s Echo device and rumors that the company is working on a home robot, he adds, “These things will develop eyes, and after that they will follow you around.”
