The robot arm is performing a peculiar kind of Sisyphean task. It hovers over a glistening pile of cooked chicken parts, dips down, and retrieves a single piece. A moment later, it swings around and places the chunk of chicken, ever so gently, into a bento box moving along a conveyor belt.
This robot, controlled by software from a San Francisco–based company called Osaro, is smarter than any you’ve seen before. The software has taught it to pick and place chicken in about five seconds. Within the year, Osaro expects its robots to find work in a Japanese food factory.
Anyone worried about a robot uprising need only step inside a modern factory to see how far away that is. Most robots are powerful and precise but can’t do anything unless programmed meticulously. An ordinary robot arm lacks the sense needed to pick up an object if it is moved an inch. It is completely hopeless at gripping something unfamiliar; it doesn’t know the difference between a marshmallow and a cube of lead. Picking up irregularly shaped pieces of chicken from a haphazard pile is an act of genius.
Industrial robots have been largely untouched by the latest advances in artificial intelligence. Over the last five or so years, AI software has become adept at identifying images, winning board games, and responding to a person’s voice with virtually no human intervention. It can even teach itself new abilities, given enough time to practice. All this while AI’s hardware cousins, robots, struggle to open a door or pick up an apple.
That is about to change. The AI software that controls Osaro’s robot lets it identify the objects in front of it, study how they behave when poked, pushed, and grasped, and then decide how to handle them. Like other AI algorithms, it learns from experience. Using an off-the-shelf camera combined with machine-learning software on a powerful computer nearby, it figures out how to grasp things effectively. With enough trial and error, the arm can learn how to grasp just about anything it might come across.
Workplace robots equipped with AI will let automation creep into many more areas of work. They could replace people anywhere that products need to be sorted, unpacked, or packed. Able to navigate a chaotic factory floor, they might take yet more jobs in manufacturing. It might not be an uprising, but it could be a revolution nonetheless. “We’re seeing a lot of experimentation now, and people are trying a lot of different things,” says Willy Shih, who studies trends in manufacturing at Harvard Business School. “There’s a huge amount of possibility for [automating] repetitive tasks.”
It’s a revolution not just for the robots, but for AI, too. Putting AI software in a physical body allows it to use visual recognition, speech, and navigation out in the real world. Artificial intelligence gets smarter as it feeds on more data. So with every grasp and placement, the software behind these robots will become more and more adept at making sense of the world and how it works.
“This could lead to advances that wouldn’t be possible without all that data,” says Pieter Abbeel, a professor at the University of California, Berkeley, and the founder of Covariant.ai (until recently called Embodied Intelligence), a startup applying machine learning and virtual reality to robotics in manufacturing.
Separated at birth
This era has been a long time coming. In 1954, George C. Devol, an inventor, patented a design for a programmable mechanical arm. In 1961, a manufacturing entrepreneur named Joseph Engelberger turned the design into the Unimate, an unwieldy, awkward machine first used on a General Motors assembly line in New Jersey.
From the beginning, there was a tendency to romanticize the intelligence behind these simple machines. Engelberger chose the name “robot” for the Unimate in honor of the androids dreamed up by the science fiction author Isaac Asimov. But his machines were crude mechanical devices directed to perform a specific task by relatively simple software. Even today’s much more advanced robots remain little more than mechanical dunces that must be programmed for every action.
Artificial intelligence followed a different path. In the 1950s, it set out to use the tools of computing to mimic human-like logic and reason. Some researchers also sought to give these systems a physical presence. As early as 1948 and 1949, William Grey Walter, a neuroscientist in Bristol, UK, developed two small autonomous machines that he dubbed Elsie and Elmer. These turtle-like devices were equipped with simple, neurologically inspired circuits that let them follow a light source on their own. Walter built them to show how the connections between just a few neurons in the brain might result in relatively complex behavior.
But understanding and re-creating intelligence proved to be a byzantine challenge, and AI went into a long period with few breakthroughs. Meanwhile, programming physical machines to do useful things in the messy real world often proved intractably complex. AI and robots have been stablemates in research labs for decades, and researchers have tried applying machine learning to industrial robots, but that has not yet taken off in industry.
Then, about six years ago, researchers figured out how to make an old AI trick incredibly powerful. The scientists were using neural networks—algorithms that approximate, roughly speaking, the way neurons and synapses in the brain learn from input. These networks were, it turns out, direct descendants of the components that gave Elsie and Elmer their abilities. The researchers discovered that very large, or “deep,” neural networks could do remarkable things when fed huge quantities of labeled data, such as recognizing the object shown in an image with near-human perfection.
The field of AI was turned upside down. Deep learning, as the technique is commonly known, is now widely used for tasks involving perception: face recognition, speech transcription, and training self-driving cars to identify pedestrians and signposts. It has made it possible to imagine a robot that could recognize your face, speak intelligently to you, and navigate safely to the kitchen to get you a soda from the fridge.
The man behind Osaro’s smarter robot
Osaro’s CEO, Derik Pridmore, studied physics and computer science at MIT before joining a West Coast VC firm called Founders Fund. While there, Pridmore identified DeepMind, a British AI company, as an investment target, and he worked with the company’s founders to hone their pitch. DeepMind would go on to teach machines to do things that seemed impossible at the time. Famously, it developed AlphaGo, the program that beat the top-ranked human grandmaster at the board game Go.
When Google acquired DeepMind in 2014, Pridmore decided that AI had commercial potential. He founded Osaro and quickly zeroed in on robot picking as the ideal application. Grasping objects loaded in a bin or rolling along a conveyor belt is a simple task for a human, but it requires genuine intelligence.
The techniques DeepMind pioneered, known as “deep reinforcement learning,” let machines perform complex tasks without learning from human-provided examples. Positive feedback, like getting a higher score in a video game, tunes the network and moves the algorithm closer to the goal until it becomes expert.
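To make the idea of learning from positive feedback concrete, here is a toy sketch in Python. It is not Osaro's or DeepMind's method, and it involves no neural network: it is a simple epsilon-greedy "bandit" learner, a much smaller relative of deep reinforcement learning. The three grasp options, their success probabilities, and the function name `train_grasp_policy` are all invented for illustration. The learner tries grasps, receives a reward of 1 for a success and 0 for a failure, and nudges its estimate for that grasp toward the observed result.

```python
import random


def train_grasp_policy(success_prob, trials=5000, epsilon=0.1, seed=0):
    """Learn by trial and error which of several grasps succeeds most often.

    success_prob is a hypothetical list of hidden success rates, one per
    grasp option. The learner never reads it directly; it only sees the
    reward (1.0 or 0.0) that comes back after each attempt.
    """
    rng = random.Random(seed)
    n = len(success_prob)
    value = [0.0] * n  # running estimate of each grasp's success rate
    count = [0] * n    # how many times each grasp has been tried

    for _ in range(trials):
        if rng.random() < epsilon:
            # Explore: occasionally try a grasp at random.
            action = rng.randrange(n)
        else:
            # Exploit: otherwise use the grasp that currently looks best.
            action = max(range(n), key=lambda a: value[a])

        # Simulated environment: the grasp either works or it does not.
        reward = 1.0 if rng.random() < success_prob[action] else 0.0

        # Feedback step: move the estimate toward the observed reward
        # (an incremental average over all attempts with this grasp).
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]

    return value


# Assumed example: three candidate grasps with made-up success rates.
estimates = train_grasp_policy([0.2, 0.5, 0.9])
best = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated success rates:", [round(v, 2) for v in estimates])
print("preferred grasp:", best)
```

A deep reinforcement learning system replaces the small table of estimates with a neural network that takes camera images as input, but the feedback loop is the same in spirit: act, observe the reward, adjust.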
The reasoning that makes this possible is buried deep within the network, encoded in the interplay of tens of millions of interconnected simulated neurons. But the resulting behavior can seem simple and instinctual. With enough practice, an arm can learn to pick things up efficiently, even when an object is moved, hidden by another object, or shaped a bit differently. Osaro uses deep reinforcement learning, along with several other machine-learning techniques, to make industrial robots a lot cleverer.
One of the first skills that AI will give machines is far greater dexterity. For the past few years, Amazon has been running a “robot picking” challenge in which researchers compete to have a robot pick up a wide array of products as quickly as possible. All of these teams are using machine learning, and their robots are gradually getting more proficient. Amazon, clearly, has one eye on automating the picking and packing of billions of items within its fulfillment centers.
“I’ve been working in robotic grasping for 35 years, and we’ve made very little progress,” says Ken Goldberg, a colleague of Abbeel’s at UC Berkeley. Thanks to advances in AI, that is changing: “We are now poised to make a big leap forward.”
AI gets a body
In the NoHo neighborhood of New York, one of the world’s foremost experts on artificial intelligence is currently looking for the field’s next big breakthrough. And he thinks that robots might be an important piece of the puzzle.
Yann LeCun played a vital role in the deep-learning revolution. During the 1980s, when other researchers dismissed neural networks as impractical, LeCun persevered. As head of Facebook’s AI research until January, and now as its chief AI scientist, he led the development of deep-learning algorithms that can identify users in just about any photo a person posts.
But LeCun wants AI to do more than just see and hear; he wants it to reason and take action. And he says it needs a physical presence to make this possible. Human intelligence involves interacting with the real world; human babies learn by playing with things. AI embedded in grasping machines can do the same. “A lot of the most interesting AI research now involves robots,” LeCun says.
A remarkable kind of machine evolution might even result, mirroring the process that gave rise to biological intelligence. Vision, dexterity, and intelligence began evolving together at an accelerated rate once hominids started walking upright, using their two free hands to examine and manipulate objects. Their brains grew bigger, enabling more advanced tools, language, and social organization.
Could AI experience something similar? Until now, it has existed largely inside computers, interacting with crude simulations of the real world, such as video games or still images. AI programs capable of perceiving the real world, interacting with it, and learning about it might eventually become far better at reasoning and even communicating. “If you solve manipulation in its fullest,” Abbeel says, “you’ll probably have built something that’s pretty close to full, human-level intelligence.”
Correction: An earlier version of this story suggested that AI and robotics research have been largely separate fields for decades. Some changes were made to clarify that the separation was largely in commercial applications rather than in the research lab.