“What do you think of my artificial voice?” asks a woman on a computer screen, her green eyes widening slightly. The image is clearly computerized, and the voice is halting, but it’s still a remarkable moment. The image is a digital avatar of a person who lost her ability to speak after a stroke 18 years ago. Now, as part of an experiment involving a brain implant and AI algorithms, she can speak with a replication of her own voice and even convey a limited range of facial expressions via her avatar.
A pair of papers published in Nature today from two independent research teams show just how quickly this field is advancing—though these proofs of concept are still a very long way from tech that’s available to the wider public. Each study involved a woman who had lost her ability to speak intelligibly: one after a brain-stem stroke and the other because of ALS, a progressive neurodegenerative disease.
The participants each had a different type of recording device implanted in her brain, and both managed to speak at a rate of roughly 60 to 80 words per minute. That’s about half the rate of normal speech, but more than four times faster than had been previously reported. One team, led by Edward Chang, a neurosurgeon at the University of California, San Francisco, also captured the brain signals controlling the small movements that produce facial expressions, allowing them to create the avatar that represented the study participant’s speech in close to real time.
The papers “represent really elegant and rigorous science and engineering for the brain,” says Judy Illes, a neuroethicist at the University of British Columbia in Vancouver, Canada, who was not involved in either study. Illes especially appreciated the addition of an expressive avatar. “Communication is not just about words between people. It’s about words and messages that are communicated through tonality, expression, accent, context,” she says. “I think it was creative and quite thoughtful to try to bring that component of personhood to what is really fundamental science, engineering, neurotechnology.”
Chang and his team have been working on the problem for more than a decade. In 2021, they demonstrated that they could capture brain activity from a person who had suffered a brain-stem stroke and translate those signals into written words and sentences, albeit slowly. In the latest paper, the team used a larger implant with double the number of electrodes—a device about the size of a credit card—to capture signals from the brain of another patient, named Ann, who lost her ability to speak after a stroke nearly two decades ago.
The implant doesn’t record thoughts. Instead it captures the electrical signals that control the muscle movements of the lips, tongue, jaw, and voice box—all the movements that enable speech. For example, “if you make a P sound or a B sound, it involves bringing the lips together. So that would activate a certain proportion of the electrodes that are involved in controlling the lips,” says Alexander Silva, a study author and graduate student in Chang’s lab. A port that sits on the scalp allows the team to transfer those signals to a computer, where AI algorithms decode them and a language model helps provide autocorrect capabilities to improve accuracy. With this technology, the team translated Ann’s brain activity into written words at a rate of 78 words per minute, using a 1,024-word vocabulary, with an error rate of 23%.
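The pipeline described above—noisy decoded words cleaned up by a language model acting as autocorrect—can be caricatured in a few lines. The sketch below is purely illustrative: the tiny vocabulary, the noisy decoder output, and the nearest-word autocorrect are stand-ins, not the actual models either team used (the real systems decode neural signals with trained networks over a 1,024-word or larger vocabulary).

```python
import difflib

# Illustrative stand-in for the closed vocabulary the decoder chooses from.
VOCAB = ["hello", "how", "is", "your", "cold", "world", "happy"]

def autocorrect(decoded_word: str, vocab=VOCAB) -> str:
    """Snap a noisy decoded word to the closest vocabulary entry."""
    matches = difflib.get_close_matches(decoded_word, vocab, n=1, cutoff=0.0)
    return matches[0] if matches else decoded_word

# Pretend the decoder produced a noisy version of "how is your cold":
noisy = ["hov", "iz", "yur", "colt"]
print(" ".join(autocorrect(w) for w in noisy))  # → how is your cold
```

The real systems do something far more sophisticated—scoring whole candidate sentences with a language model rather than fixing words one at a time—but the division of labor is the same: a neural decoder proposes, a language model cleans up.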
Chang’s group also managed to decode brain signals directly into speech, a first for any group. And the muscle signals it captured allowed the participant, via the avatar, to express three different emotions—happy, sad, and surprised—at three different levels of intensity. “Speech isn’t just about communicating just words but also who we are. Our voice and expressions are part of our identity,” Chang says. The trial participant hopes to become a counselor. It’s “my moonshot,” she told the researchers. She thinks this kind of avatar might make her clients feel more at ease. The team used a recording from her wedding video to replicate her speaking voice, so the avatar even sounds like her.
The second team, led by researchers from Stanford, first posted its results as a preprint in January. The researchers gave a participant with ALS, named Pat Bennett, four much smaller implants—each about the size of an aspirin—that can record signals from single neurons. Bennett trained the system by reading syllables, words, and sentences over the course of 25 sessions.
The researchers then tested the technology by having her read sentences that hadn’t been used during training. When those sentences were drawn from a vocabulary of 50 words, the error rate was about 9%. When the team expanded the vocabulary to 125,000 words, which encompasses much of the English language, the error rate rose to about 24%.
Speech using these interfaces isn’t seamless. It’s still slower than normal speaking, and while an error rate of 23% or 24% is far better than previous results, it’s still not great. In some instances, the system replicated sentences perfectly. In others, “How is your cold?” came out as “Your old.”
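The error rates quoted in these studies are word error rates: the number of word-level edits (insertions, deletions, substitutions) needed to turn the decoded sentence into the intended one, divided by the intended sentence's length. A minimal sketch, using the article's own "How is your cold?" example (the function here is a generic textbook implementation, not either team's evaluation code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("how is your cold", "your old"))  # → 0.75
```

On that single botched sentence the error rate is 75%; the 23–24% figures are averages over many test sentences, some decoded perfectly and some not.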
But scientists are convinced they can do better. “What’s exciting is that as you add more of these electrodes, the decoder performance keeps going up,” says Francis Willett, a neuroscientist and author on the Stanford paper. “If we can get more electrodes and more neurons, then we should be able to be even more accurate.”
The current systems aren’t practical for home use. Because they rely on wired connections and a bulky computer system to handle the processing, the women can’t use the brain implants to communicate outside of the experiment. “There’s a whole lot of work still there to turn this knowledge into something useful for people with unmet needs,” says Nick Ramsey, a neuroscientist at the UMC Utrecht Brain Center in Utrecht, the Netherlands, and author of an accompanying commentary.
Illes also cautions that each team is reporting results from a single individual, and they may not hold for other people, even those with similar neurological conditions. “This is a proof of concept,” she says. “We know that brain injury is really messy and highly heterogeneous. Generalizability even within the stroke population or the ALS population—it’s possible, but it’s not certain.”
But it does open up the possibility of a technological solution for people who lose the ability to communicate. “What we’ve done is to prove that it’s possible and that there is a pathway to do it,” Chang says.
Being able to speak is crucial. The participant in Chang’s study used to rely on a letterboard to communicate. “My husband was so sick of having to get up and translate the letterboard for me,” she told researchers. “We didn’t argue, because he didn’t give me a chance to argue back. As you can imagine, this frustrated me greatly!”