These robots know when to ask for help

Combining large language models with confidence scores helps robots recognize uncertainty. That could be key to making them safe and trustworthy.

June Kim

December 8, 2023

Allen Ren et al./Princeton University

There are two bowls on the kitchen table: one made of plastic, the other metal. You ask the robot to pick up the bowl and put it in the microwave. Which one will it choose?

A human might ask for clarification, but given the vague command, the robot may place the metal bowl in the microwave, causing sparks to fly.

A new training model, dubbed “KnowNo,” aims to address this problem by teaching robots to ask for our help when orders are unclear. At the same time, it ensures they seek clarification only when necessary, minimizing needless back-and-forth. The result is a smart assistant that tries to make sure it understands what you want without bothering you too much.

Andy Zeng, a research scientist at Google DeepMind who helped develop the new technique, says that while robots can be powerful in many specific scenarios, they are often bad at generalized tasks that require common sense.

For example, when asked to bring you a Coke, the robot needs to first understand that it needs to go into the kitchen, look for the refrigerator, and open the fridge door. Conventionally, these smaller substeps had to be manually programmed, because otherwise the robot would not know that people usually keep their drinks in the kitchen.

That’s something large language models (LLMs) could help to fix, because they have a lot of common-sense knowledge baked in, says Zeng.

Now when the robot is asked to bring a Coke, an LLM, which has a generalized understanding of the world, can generate a step-by-step guide for the robot to follow.

The problem with LLMs, though, is that there’s no way to guarantee that their instructions are possible for the robot to execute. Maybe the person doesn’t have a refrigerator in the kitchen, or the fridge door handle is broken. In these situations, robots need to ask humans for help.

KnowNo makes that possible by combining large language models with statistical tools that quantify confidence levels.

When given an ambiguous instruction like “Put the bowl in the microwave,” KnowNo first generates multiple possible next actions using the language model. Then it creates a confidence score predicting the likelihood that each potential choice is the best one.
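As a rough illustration of the scoring step, the sketch below turns raw per-option scores into confidences that sum to 1 using a softmax. The function name, the option labels, and the idea that the language model returns one raw score per candidate action are assumptions made for this example, not details from the study.

```python
import math


def softmax_confidences(option_scores):
    """Convert raw per-option scores into confidences that sum to 1.

    option_scores is a hypothetical dict mapping each candidate action
    to a raw score (for example, a log-likelihood from a language model).
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(option_scores.values())
    exps = {opt: math.exp(score - max_score) for opt, score in option_scores.items()}
    total = sum(exps.values())
    return {opt: value / total for opt, value in exps.items()}


# Made-up scores for the bowl example.
confidences = softmax_confidences({
    "pick up the plastic bowl": 2.0,
    "pick up the metal bowl": 1.8,
    "open the microwave": -1.0,
})
for option, confidence in confidences.items():
    print(f"{option}: {confidence:.2f}")
```

Two options with similar raw scores end up with similar confidences, which is the signal that the instruction is ambiguous.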

These confidence estimates are sized up against a predetermined certainty threshold, which indicates exactly how confident or conservative the user wants a robot to be in its actions. For example, a robot set to a target success rate of 80% should make the correct decision at least 80% of the time.
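One way to turn a target success rate into a concrete cutoff is split conformal prediction, a standard statistical technique. The article does not spell out the calculation, so the sketch below is an assumption about how such a threshold could be set: the function name, the calibration data, and the 80% target are all illustrative.

```python
import math


def calibrate_threshold(true_option_confidences, target_success=0.8):
    """Pick a confidence cutoff from held-out calibration scenarios.

    true_option_confidences holds, for each calibration scenario, the
    confidence the model assigned to the option that was actually correct.
    Keeping every option at or above the returned cutoff should include
    the correct option in roughly target_success of new scenarios.
    """
    n = len(true_option_confidences)
    # Nonconformity score: low confidence in the correct option means a high score.
    scores = sorted(1.0 - c for c in true_option_confidences)
    # Finite-sample corrected rank used by split conformal prediction.
    rank = math.ceil((n + 1) * target_success)
    if rank > n:
        # Too few calibration scenarios for this target: keep every option.
        return 0.0
    return 1.0 - scores[rank - 1]


# Made-up calibration confidences for nine past scenarios.
calibration = [0.9, 0.8, 0.7, 0.6, 0.5, 0.95, 0.85, 0.75, 0.65]
threshold = calibrate_threshold(calibration, target_success=0.8)
print(f"keep options with confidence >= {threshold:.2f}")
```

Raising the target success rate lowers the cutoff, so more options survive and the robot asks for help more often. That is the trade-off between independence and caution.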

This is useful in situations with varying degrees of risk, says Anirudha Majumdar, an assistant professor of mechanical and aerospace engineering at Princeton and the senior author of the study.

You may want your cleaning robot to be more independent, despite a few mistakes here and there, so that you don’t have to supervise it too closely. But for medical applications, robots must be extremely cautious, with the highest level of success possible.

When there is more than one option for how to proceed, the robot pauses to ask for clarification instead of blindly continuing: “Which bowl should I pick up—the metal or the plastic one?”
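The ask-or-act rule can be sketched in a few lines. This is a minimal illustration, assuming each candidate action already has a confidence score and that a cutoff has been chosen; the function name, option labels, and numbers are hypothetical.

```python
def choose_or_ask(option_confidences, threshold):
    """Return ("act", option) when exactly one option clears the threshold,
    otherwise ("ask", options) so a human can pick."""
    candidates = [
        option
        for option, confidence in option_confidences.items()
        if confidence >= threshold
    ]
    if len(candidates) == 1:
        return ("act", candidates[0])
    if not candidates:
        # Nothing is confident enough: show every option to the human.
        return ("ask", list(option_confidences))
    return ("ask", candidates)


# Ambiguous request: two bowls look equally plausible, so the robot asks.
ambiguous = choose_or_ask(
    {"pick up the metal bowl": 0.45, "pick up the plastic bowl": 0.50, "open the microwave": 0.05},
    threshold=0.3,
)
print(ambiguous)

# Clear request: only one option clears the threshold, so the robot acts.
clear = choose_or_ask(
    {"pick up the plastic bowl": 0.9, "pick up the metal bowl": 0.1},
    threshold=0.3,
)
print(clear)
```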

KnowNo was tested on three robots in more than 150 different scenarios. Results showed that KnowNo-trained robots had more consistent success rates while needing less human assistance than those trained without the same statistical calculations. The paper describing the research was presented at the Conference on Robot Learning in November.

Because human language is often ambiguous, teaching robots to recognize and respond to uncertainty can improve their performance.

Studies show that people prefer robots that ask questions, says Dylan Losey, an assistant professor at Virginia Tech who specializes in human-robot interaction and was not involved in this research. When robots reach out for help, it increases transparency about how they’re deciding what to do, which leads to better interactions, he says.

Allen Ren, a PhD student at Princeton and the study’s lead author, says there are several ways to improve KnowNo. Right now, it assumes robots’ vision is always reliable, which may not be the case with faulty sensors. Also, the model can be updated to factor in potential errors coming from human help.

AI’s ability to express uncertainty will make us trust robots more, says Majumdar. “Quantifying uncertainty is a missing piece in a lot of our systems,” he says. “It allows us to be more confident about how safe and successful the robots will be.”

