
Amazon’s New Robo-Picker Champion Is Proudly Inhuman

It only needs to see seven images of a new object before it can reliably spot and grab it.

A robot that owes rather a lot to an annoying arcade game has captured victory in Amazon’s annual Robotics Challenge.

E-commerce companies like Amazon and Ocado, the world’s largest online-only grocery retailer, currently boast some of the most heavily automated warehouses in the world. But items for customers’ orders still aren’t picked by robots, because machines cannot yet reliably grasp a wide range of objects.

That’s why Amazon gathers researchers each year to test machines that pick and stow objects. It’s a tough job, but one that could ultimately help the company fully automate its warehouses. This year the task was made even harder than usual: teams had only 30 minutes for their robots to familiarize themselves with the objects before trying to pick them out of a jumble of items. That, says Amazon, is meant to better simulate warehouse conditions, where new stock is arriving all the time and pallets may not be neatly organized.

The winner, a robot called Cartman, was built by the Australian Centre for Robotic Vision. Unlike many competitors, which used robot arms to carry out the tasks, Cartman is distinctly inhuman, with its grippers moving in 3-D along straight lines like an arcade claw crane. But it works far, far better. According to Anton Milan, one of Cartman’s creators, the device’s computer-vision systems were crucial to the victory. “One feature of our system was that it worked off a very small amount of hand annotated training data,” he explained to TechAU. “We only needed just seven images of each unseen item for us to be able to detect them.”

That kind of fast learning is a major area of machine-learning research. Last year, DeepMind showed off a so-called “one-shot” learning system that can identify objects in an image after having seen them only once before. But the need to identify objects that are obscured by other items, and then pick them up, means that Cartman needs a little more data than that.
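The article doesn’t describe the Cartman team’s vision pipeline in detail, but the standard way to get a detector working from a handful of hand-annotated photos is to fine-tune a model pretrained on a large dataset. Below is a minimal sketch in Python of that generic approach, assuming a COCO-pretrained Faster R-CNN from torchvision; the training data here is a placeholder, and none of the names come from the actual system.

```python
# Minimal sketch: fine-tuning a COCO-pretrained detector on a handful
# of labeled images per new item. This is generic few-shot fine-tuning,
# NOT the Cartman team's actual pipeline; the data below is a placeholder.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def make_sample():
    # Stand-in for one hand-annotated photo: an image tensor plus
    # bounding boxes and class labels for the items it contains.
    image = torch.rand(3, 480, 640)
    target = {
        "boxes": torch.tensor([[100.0, 120.0, 300.0, 340.0]]),  # [x1, y1, x2, y2]
        "labels": torch.tensor([1]),  # hypothetical item class id
    }
    return image, target

samples = [make_sample() for _ in range(7)]  # ~7 images per item, per the article

# Load a detector pretrained on COCO and swap its classification head
# for the new classes (here: one item class plus background).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
model.train()
for epoch in range(5):
    for image, target in samples:
        # In training mode the model returns a dict of detection losses.
        loss_dict = model([image], [target])
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

What typically makes so few labels plausible is the pretrained backbone plus heavy data augmentation, which together do most of the work before any warehouse-specific images are seen.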

(Read more: TechAU, “Robot, Get the Fork Out of My Sink,” “Machines Can Now Recognize Something After Seeing It Once,” “Inside Amazon”)
