  • Ms. Tech | Getty images
  • Intelligent Machines

    This AI lets you deepfake your voice to speak like Barack Obama

    Advances in machine learning will soon make it possible to sound like yourself with a different age or gender—or impersonate someone else.

    Meet my alter ego, Katie: 

    The accent, emotion, and intonation are all mine. But somehow I now sound like a youngish woman with a high-pitched voice.

    My feminine “voice skin” was created by Modulate.ai, a company based in Cambridge, Massachusetts. The firm uses machine learning to copy, model, and manipulate the properties of voice in a powerful new way.

    The technology goes far beyond the simple voice filters that can let you sound like Kylo Ren. Using this approach, it is possible to assume any age, gender, or tone you’d like, all in real time. Or to take on the voice of a celebrity. I can hold a lengthy phone conversation in the guise of Katie if I wish.

    I visited Modulate’s headquarters to hear about the company’s technology and ambitions, and to discuss the ethical implications of using AI to copy someone else’s voice. In a sound-isolated booth, I tried out a few of the company’s voice skins.

    Here’s my actual voice:

    And here it is being fed through another persona:

    And being changed between the two personas in real time.

    The voice-modeling technology isn’t perfect; each new voice is a little warbly. But it’s remarkably good, and it improves by feeding on more of your voice data. And it shows how advances in machine learning are rapidly starting to alter digital reality.

    Modulate uses generative adversarial networks (GANs) to capture and model the audio properties of a voice signal. GANs pit two neural networks against each other in a battle to capture and reproduce the properties of a data set convincingly (see “The GANfather”).
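
    For readers who want to see the adversarial idea in miniature, here is a toy sketch in Python. It is an illustration only and is not drawn from Modulate's system: the one-parameter "networks", the made-up target data (numbers clustered around 4), and the name `train_toy_gan` are all assumptions chosen for brevity. A generator tries to produce numbers that look like the real data, while a discriminator tries to tell real from generated, and each is updated against the other.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Train a 1-D toy GAN and return (a, b, w, c).

    Generator:      x_fake = a * z + b, with z drawn from N(0, 1)
    Discriminator:  D(x) = sigmoid(w * x + c), its estimate that x is real
    The "real" data is an assumption for this sketch: samples from N(4, 0.5).
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0
    w, c = 0.1, 0.0
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)
        z = rng.gauss(0.0, 1.0)
        x_fake = a * z + b

        # Discriminator step: minimize -log D(x_real) - log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w -= lr * ((d_real - 1.0) * x_real + d_fake * x_fake)
        c -= lr * ((d_real - 1.0) + d_fake)

        # Generator step: minimize -log D(x_fake), i.e. try to fool
        # the freshly updated discriminator.
        d_fake = sigmoid(w * x_fake + c)
        a -= lr * (d_fake - 1.0) * w * z
        b -= lr * (d_fake - 1.0) * w
    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print("generator scale and offset:", a, b)
```

    A real voice model would replace these single numbers with deep networks operating on audio, but the back-and-forth training loop has the same shape. Toy GANs like this one can oscillate rather than settle, so the final values should not be read as a guaranteed result.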

    Machine learning has made it possible to swap two people’s faces in a video, using software that can be downloaded free from the internet (see “Fake America great again”). AI researchers are using GANs and other techniques to manipulate visual scenes and even conjure up completely fake faces.

    Modulate has a demonstration voice skin of Barack Obama on its site, and cofounder and CEO Mike Pappas said it would be possible to generate one for anyone, given enough training data. But he adds that the company won’t make a celebrity voice skin available without the owner’s permission. He also insists that deception isn’t the main point.

    “This isn’t technology built to imitate people,” Pappas says. “It’s built to give you new opportunities.”

    Modulate is targeting online games such as Fortnite or Call of Duty, in which players can chat with strangers through a microphone. This can enhance the gameplay, but it can also open the door to abuse and harassment.

    “When we want to interact online and have really deep experiences, voices are crucial,” says Pappas. “But some people aren’t willing to actually put their voice out there. In some cases, maybe I just want to stay anonymous. In other cases, I’m worried that I’m going to reveal my age or gender and get harassed.”

    Charles Seife, a professor at NYU who studies the spread of misinformation, says the technology seems significantly more advanced than other voice modification technology. And he says the way AI can now manipulate video and audio has the potential to fundamentally alter the media. “We have to start thinking about what constitutes reality,” he says.

    “So far, the quality of voice conversion technology has been low so that one can easily distinguish a converted voice,” adds Tuomas Virtanen, an expert on voice synthesis and manipulation at Tampere University in Finland. “But I can imagine that in the near future the quality will be good enough so that conversion cannot be detected easily.”

    Modulate is aware that its technology has the potential to be misused. The company says it will seek assurances that any customer copying someone’s voice has that person’s permission. It has also developed an audio watermarking technology that could be used to detect a copied voice. A detector could then issue a warning if someone is using a fake voice on a call, for example.
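
    The article does not say how Modulate's watermark works, so the following Python sketch shows only one generic, textbook-style approach, under stated assumptions: a faint pseudorandom pattern derived from a secret key is added to the audio, and a detector that knows the key checks whether the audio correlates with that pattern. The function names, the key values, and the `strength` parameter are hypothetical.

```python
import math
import random


def keyed_sequence(key, n):
    # Pseudorandom +1/-1 pattern reproducible from the secret key.
    rng = random.Random(key)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(n)]


def embed(samples, key, strength=0.05):
    # Add the keyed pattern to the audio at low amplitude.
    seq = keyed_sequence(key, len(samples))
    return [s + strength * p for s, p in zip(samples, seq)]


def score(samples, key):
    # Average correlation between the audio and the keyed pattern.
    # Close to `strength` if the mark is present, close to 0 otherwise.
    seq = keyed_sequence(key, len(samples))
    return sum(s * p for s, p in zip(samples, seq)) / len(samples)


def is_watermarked(samples, key, strength=0.05):
    # Flag the audio when the correlation exceeds half the embed strength.
    return score(samples, key) > strength / 2


# Two seconds of a 220 Hz tone at 16 kHz stand in for a synthesized voice.
rate = 16000
clean = [math.sin(2 * math.pi * 220 * t / rate) for t in range(2 * rate)]
marked = embed(clean, key=1234)

if __name__ == "__main__":
    print("marked audio flagged:", is_watermarked(marked, key=1234))
    print("clean audio flagged:", is_watermarked(clean, key=1234))
```

    A production watermark would also need to survive compression, resampling, and background noise, and to stay inaudible, none of which this toy version attempts.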

    “We’ve built ethical safeguards into our company from the ground up,” says cofounder and CTO Carter Huffman, “from how we distribute our technology, to how we select the voice skins to offer, to watermarking our audio for detection in sensitive systems.”

    Modulate might be able to limit the misuse of its own technology, but it’s possible others will develop similar technology independently, and make it available for people to misuse. The question is, how widely might this be misused, and how savvy about it will the public become?

    Pappas is optimistic, and he believes the threat of AI fakery is often overblown. “It’s definitely something where you want to be cognizant of it, but it’s not something where the very facets of society are crumbling down,” he says. “We have tools to handle this.”
