This AI lets you deepfake your voice to speak like Barack Obama

Advances in machine learning will soon make it possible to sound like yourself with a different age or gender—or impersonate someone else.
February 27, 2019
Ms. Tech | Getty Images

Meet my alter ego, Katie: 

The accent, emotion, and intonation are all mine. But somehow I now sound like a youngish woman with a high-pitched voice.

My feminine “voice skin” was created by Modulate.ai, a company based in Cambridge, Massachusetts. The firm uses machine learning to copy, model, and manipulate the properties of voice in a powerful new way.

The technology goes far beyond the simple voice filters that can let you sound like Kylo Ren. Using this approach, it is possible to assume any age, gender, or tone you’d like, all in real time. Or to take on the voice of a celebrity. I can hold a lengthy phone conversation in the guise of Katie if I wish.

I visited Modulate’s headquarters to hear about the company’s technology and ambitions, and to discuss the ethical implications of using AI to copy someone else’s voice. In a sound-isolated booth, I tried out a few of the company’s voice skins.

Here’s my actual voice:

And here it is being fed through another persona:

And here it is being switched between the two personas in real time:

The voice-modeling technology isn’t perfect; each new voice is a little warbly. But it’s remarkably good, and it improves by feeding on more of your voice data. And it shows how advances in machine learning are rapidly starting to alter digital reality. Modulate uses generative adversarial networks (GANs) to capture and model the audio properties of a voice signal. GANs pit two neural networks against each other in a battle to capture and reproduce the properties of a data set convincingly (see “The GANfather”).
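
That contest can be made concrete with a little arithmetic. The sketch below is a generic textbook formulation, not Modulate's code: a discriminator outputs a probability that a clip is real, and the two networks are scored by opposing losses. The function names and the sample scores are invented for illustration.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss: low when real clips score near 1 and fakes near 0.

    Scores are probabilities strictly between 0 and 1.
    """
    real_term = sum(-math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator rates fakes near 1."""
    return sum(-math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator scores (probability that a clip is real).
real_scores = [0.90, 0.95, 0.92]
early_fakes = [0.05, 0.10, 0.08]  # early in training: fakes are easy to spot
later_fakes = [0.45, 0.50, 0.55]  # later: the discriminator is nearly guessing

print(generator_loss(early_fakes), generator_loss(later_fakes))
print(discriminator_loss(real_scores, early_fakes),
      discriminator_loss(real_scores, later_fakes))
```

With these made-up numbers, the generator's loss falls as its fakes become harder to spot, while the discriminator's loss rises. That tug-of-war is what drives training.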

Machine learning has made it possible to swap two people’s faces in a video, using software that can be downloaded free from the internet (see “Fake America great again”). AI researchers are using GANs and other techniques to manipulate visual scenes and even conjure up completely fake faces.

Modulate has a demonstration voice skin of Barack Obama on its site, and cofounder and CEO Mike Pappas says it would be possible to generate one for anyone, given enough training data. But he adds that the company won’t make a celebrity voice skin available without the owner’s permission. He also insists that deception isn’t the main point.

“This isn’t technology built to imitate people,” Pappas says. “It’s built to give you new opportunities.”

Modulate is targeting online games such as Fortnite or Call of Duty, in which players can chat with strangers through a microphone. This can enhance the gameplay, but it can also open the door to abuse and harassment.

“When we want to interact online and have really deep experiences, voices are crucial,” says Pappas. “But some people aren’t willing to actually put their voice out there. In some cases, maybe I just want to stay anonymous. In other cases, I’m worried that I’m going to reveal my age or gender and get harassed.”

Charles Seife, a professor at NYU who studies the spread of misinformation, says the technology seems significantly more advanced than other voice modification technology. And he says the way AI can now manipulate video and audio has the potential to fundamentally alter the media. “We have to start thinking about what constitutes reality,” he says.

"So far, the quality of voice conversion technology has been low so that one can easily distinguish a converted voice," adds Tuomas Virtanen, an expert on voice synthesis and manipulation at Tampere University in Finland. "But I can imagine that in the near future the quality will be good enough so that conversion cannot be detected easily."

Modulate is aware that its technology has the potential to be misused. The company says it will seek assurances that any customer copying someone’s voice has that person’s permission. It has also developed an audio watermarking technology that could be used to detect a copied voice. This could issue a warning if someone is using a fake voice on a call, for example. 
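
Modulate has not said how its watermark works. One classic approach, used here purely as an illustrative assumption, is spread-spectrum watermarking: a faint pseudo-random pattern derived from a secret key is added to the audio, and a detector that knows the key checks whether the signal correlates with that pattern. The function names, the key, and the strength value below are all hypothetical, and a real system would shape the mark so that it is inaudible and survives compression.

```python
import math
import random

def key_sequence(key, n):
    """Derive a repeatable sequence of +1/-1 values from a secret key."""
    rng = random.Random(key)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(n)]

def embed(samples, key, strength=0.05):
    """Add a faint copy of the key sequence to the audio samples."""
    mark = key_sequence(key, len(samples))
    return [s + strength * m for s, m in zip(samples, mark)]

def detect(samples, key, strength=0.05):
    """Report True if the audio correlates with the key sequence."""
    mark = key_sequence(key, len(samples))
    score = sum(s * m for s, m in zip(samples, mark)) / len(samples)
    return score > strength / 2  # threshold halfway between marked and unmarked

# One second of a 220 Hz tone at 16 kHz, standing in for a voice recording.
voice = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
marked = embed(voice, key=1234)

print(detect(marked, key=1234), detect(voice, key=1234))
```

A call-screening system built along these lines could raise a warning whenever the detector reports a match, while audio without the mark, or checked against the wrong key, would pass unflagged.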

"We've built ethical safeguards into our company from the ground up," says cofounder and CTO, Carter Huffman, "from how we distribute our technology, to how we select the voice skins to offer, to watermarking our audio for detection in sensitive systems.”

Modulate might be able to limit the misuse of its own technology, but it’s possible others will develop similar technology independently, and make it available for people to misuse. The question is, how widely might this be misused, and how savvy about it will the public become?

Pappas is optimistic, arguing that the threat of AI fakery is often overblown. “It’s definitely something where you want to be cognizant of it, but it’s not something where the very facets of society are crumbling down,” he says. “We have tools to handle this.”
