Why Talking Computers Are Tough to Listen To
Maybe you watched the unveiling of IBM’s Watson live on Jeopardy! in 2009. Or perhaps you caught the tech firm’s latest ad campaign on TV, which features goofy dialogues between Watson and Serena Williams, Richard Thaler, or Bob Dylan.
Even if not, chances are you’ve interacted with a talking computer at some point. But creating a convincing talking computer is actually really hard. In an interesting story in the New York Times on Monday, tech writer John Markoff discussed the effort that went into creating the voice for IBM’s Watson and used that as a way into a discussion of the efforts under way to create more natural and acceptable computer voices.
This is one of the fascinating challenges of human-computer interaction: social and emotional cues are vitally important when it comes to vocal communications. It’s not only jarring if the voice of an assistant such as Apple’s Siri or Amazon’s Alexa sounds unnatural. It can also be vexing when such a system fails to recognize your tone and modulate its own voice accordingly. After you ask the same question with increasing frustration, for instance, it feels like an affront for an artificial voice to continually produce the same deadpan response.
A little while after Siri came out, I wrote about the importance of trying to capture humor for creating something capable of entertaining users while avoiding annoying them. Indeed, the need to fit artificial intelligence into an existing social framework may explain why we find it necessary to assign characteristics such as gender to even fictional robots. Perhaps this even explains why Apple recently acquired Emotient, a company that focuses on reading and responding to human emotions.

It’s also interesting to consider the potential of truly engaging, emotionally powerful computer interfaces, of the kind portrayed so well in the Spike Jonze movie Her. But it’s still very difficult to decode and mimic all of the subtleties of human communication. As Michael Picheny, a senior manager at the Watson Multimodal Lab for IBM Research, says in the NYT piece: “A good computer-machine interface is a piece of art, and should be treated as such.”
(Source: New York Times)
Keep Reading
Most Popular
DeepMind’s cofounder: Generative AI is just a phase. What’s next is interactive AI.
“This is a profound moment in the history of technology,” says Mustafa Suleyman.
What to know about this autumn’s covid vaccines
New variants will pose a challenge, but early signs suggest the shots will still boost antibody responses.
Human-plus-AI solutions mitigate security threats
With the right human oversight, emerging technologies like artificial intelligence can help keep business and customer data secure
Next slide, please: A brief history of the corporate presentation
From million-dollar slide shows to Steve Jobs’s introduction of the iPhone, a bit of show business never hurt plain old business.
Stay connected
Get the latest updates from
MIT Technology Review
Discover special offers, top stories, upcoming events, and more.