I Was There When is an oral history project that’s part of the In Machines We Trust podcast. It features stories of how breakthroughs and watershed moments in artificial intelligence and computing happened, as told by the people who witnessed them. In this episode we meet Alex Serdiuk, founder and CEO of Respeecher.
- Star Wars: The Empire Strikes Back. Lucasfilm LTD, 2004.
- The Mandalorian Luke Skywalker Deepfake, via YouTube.
This project was produced by Jennifer Strong, Anthony Green and Emma Cillekens. It was edited by Mat Honan and mixed by Garret Lang with original music by Jacob Gorski. The art is from Eric Mongeon and Stephanie Arnett.
Darth Vader: “No. I am your father.”
Jennifer: That’s the sound of one of the most recognizable villains in Hollywood history, Darth Vader.
For the past 45 years the legendary character has been performed by James Earl Jones. But in a spin-off TV series for the Star Wars franchise called Obi-Wan Kenobi, his voice is actually AI, trained by the Ukrainian company Respeecher.
They’re also responsible for this voice…
Luke Skywalker: "He is strong with the force. But talent without training is nothing."
Jennifer: Young Luke in The Mandalorian series. And others, created for Disney, sometimes from a bomb shelter in the midst of a war that is very much real.
I’m Jennifer Strong, and this is I Was There When—an oral history project featuring the stories of breakthroughs and watershed moments in AI and computing, as told by those who witnessed them.
This episode, we meet the man at the helm of a company using machines to bring voices to life.
Alex Serdiuk: I was there when a very big Hollywood studio, one of the major studios, said that they have a project for us and they want to try it. And then once we signed all the paperwork. We received data and we understood what an iconic voice we have to work with and what an honor. It was a moment when we understood that this technology can change a lot in the content making industry.
Alex Serdiuk: My name is Alex Serdiuk. I am the founder and CEO at Respeecher. We provide several different services for clients in the audio domain, and that's mostly about converting the speech of one person into the speech of another particular person. So you might have heard our work. For example, we did the voice of young Luke Skywalker in The Mandalorian and in The Book of Boba Fett, and just recently got credited in four of six episodes of Obi-Wan Kenobi.
Alex Serdiuk: We were in charge of the voice of Vince Lombardi in the Super Bowl 2021 opening. So we managed to get the synthetic speech, the synthetic sound, synthetic voice that is almost indistinguishable from something that was just recorded. But one of the biggest voices we've worked on so far has been Luke Skywalker's voice in The Mandalorian. That was the first project we started to work with this voice. And yeah, when we received this task, when we received those recordings, we understood that it wouldn't be an easy project for us. The data was old. The data was from different sources, and the goal was to make it sound like it was recorded yesterday by Mark Hamill, who is 30 years old. As well, all the information regarding the fact that the Luke character would appear at the very end of The Mandalorian was extremely sensitive. So we understood how important a task lay on our shoulders.
Alex Serdiuk: Uh, yeah, I mean, it's hard to go back because we live a bit different life here now. We, we saw what's going on back in the end of 2021. Uh, everyone was worried, but no one actually expected that scale of invasion. Uh, as a company, we prepared some contingency plans and just three, four weeks before the full blown invasion, we relocated big part of the team to western regions of Ukraine, where it's obviously a bit safer. And we basically suggested our folks to, to just stay there for some time to see how it goes.
Alex Serdiuk: So when missiles started to hit our big cities on February 24th, we had like just half of Kiev team in Kiev. It was not easy to understand the scale of invasion and how rude the invasion goes against civilians. It took some time to get used to war if one can get used to war, but luckily we managed to keep Respeecher absolutely operational. So we had no disruption in our work and actually some of the files, some of the work for Obi-Wan Kenobi, we delivered from bomb shelters on February 24th. We are extremely proud of how our country, how our nation responds.
Alex Serdiuk: We saw this huge resilience and we've been part of this resilience. We need just internet, good headphones, electricity to work. But now most of us are back in the Kiev office and we keep pushing it. We keep working. We are very grateful that none of our clients stopped their contracts. We showed that we can work and operate and they believed, and then we proved that we can work and operate quite efficiently and the company grows. We introduced new directions at Respeecher. Opened this stream of democratized version of the technology, the voice marketplace that's now available for small creators and more and more people use it. We launched healthcare initiative. Where we can help people with distorted speech, like laryngectomy patients to be able to convert their voice even on a fly, using a real time system into a better sounding voice. And I'm extremely proud of our team because when war starts, everyone thinks about where would I be most helpful and many found an answer that in doing what I do best. And here we employ people, we bring money to Ukrainian economy. Then after you are able to get your family, your kids out of Ukraine, uh, so you are sure they're in safe place, you actually can pay way more attention to work and find some rescue in work because you don't read news when you are focused on doing something you used to do and you might be even more efficient.
Alex Serdiuk: So now I work very long hours because if I wasn't working long hours, I would be anxious about what's happening around. There was one moment when we had to deliver a project in cooperation with Metaphysic, the company that does state-of-the-art deepfakes. They did this piece, the Tom Cruise deepfake, you might have seen. The piece was to make a tribute from Aloe Blacc, the singer who is famous for his song I Need a Dollar, to his close friend Avicii, for one of the most famous Avicii songs, Wake Me Up. And when we were making this song, we had to convert it into five different languages, so from different speakers, into Aloe Blacc's voice, and it should sound smooth. At some point you start thinking about the lyrics and it says, “Wake me up when it's all over.” So it was, it was quite a moment when, when you, you understand it, how it resonates. Sometimes we rent a cinema, a small cinema in Kiev in Ukraine, to be able to group together and see the piece we were part of.
Alex Serdiuk: And when you see, when you hear the work that has been delivered by us on a big screen in a big movie like Mandalorian or Book of Boba Fett that's been watched by millions of fans who were not expecting for a character that's been with them for 30, 40 years, since their childhood actually, uh, to appear again. And it's that moment when you are in charge of bringing history back to life. Bringing some moments of joy for millions of people back. And then you go and you start reading comments. You start seeing reviews, and then you see how a grown man cries when he sees Luke. That's extraordinary. And all the time when something with our work goes out on a scale of like a triple A film or, or game, we try to make a small party. We listen carefully because often we provide just raw files and then post editing is on the team that does post editing in the studio, and that's the first time when we hear the result. And then we are moving towards to the world where content makers start to compete with their creative ideas, with their approach, to the way how they're doing projects—not with their budgets.
Alex Serdiuk: And that's the world I, I'd like to live in. With technology like ours, you can create indistinguishable speech. That's why we had to build quite strict ethics policies from the very beginning. So we always should have permission from the target voice, but also we do allocate quite a lot of resources and time in several directions of protecting the general society from misuses of technologies like ours. Like creating technologies that would be able to detect synthetic speech, watermarking and technologies that would be able to tell a particular, say, Respeecher-generated content from any other content, and bringing overall awareness about the tools to the level where many people know about the fact that voice can be manipulated and soon might be manipulated by different bad players.
Alex Serdiuk: So, in my view, the most important piece is about bringing awareness. Like we used to be scared of Photoshop. We thought that all the photos now will be manipulated and all the Photoshop applications would be porn applications. But it turned out that most of the images on the internet are real, and Photoshop is not that much used for porn manipulation. Same with other different technologies, including synthetic media, but it can also be used by bad players. And then we start to treat the information we consume differently, and that's the end goal. We need to be reasonably skeptical about the information we receive. We shouldn't believe in everything we receive. And technologies would catch up. They, they would help us detect, they would help us mark. But it's all about our perception.
Jennifer: Do you have a story to tell? Know someone who does? Drop us an email at podcasts at technology review dot com. You can find links to our reporting in the show notes and you can support our journalism by going to techreview dot com slash subscribe.
Jennifer: This project was produced by me with Anthony Green and Emma Cillekens. We’re edited by Mat Honan and our mix engineer is Garret Lang.
Thanks for listening, I’m Jennifer Strong.