Why Talking Computers Are Tough to Listen To

The subtle social and emotional cues in our voices are vital to how we communicate—and making a computer reproduce them is really hard.

Maybe you watched the unveiling of IBM’s Watson live on Jeopardy! in 2011. Or perhaps you caught the tech firm’s latest ad campaign on TV, which features goofy dialogues between Watson and Serena Williams, Richard Thaler, or Bob Dylan.

Even if not, chances are you’ve interacted with a talking computer at some point. But creating a convincing one is hard. In a New York Times story on Monday, tech writer John Markoff described the effort that went into creating the voice for IBM’s Watson, and used it as a way into the broader efforts under way to create more natural and acceptable computer voices.

This is one of the fascinating challenges of human-computer interaction: social and emotional cues are vitally important in vocal communication. It’s not only jarring if the voice of an assistant such as Apple’s Siri or Amazon’s Alexa sounds unnatural. It can also be vexing when such a system fails to recognize your tone and modulate its own voice accordingly. After you ask the same question with increasing frustration, for instance, it feels like an affront for an artificial voice to keep producing the same deadpan response.

A little while after Siri came out, I wrote about why capturing humor matters for creating something that can entertain users without annoying them. Indeed, the need to fit artificial intelligence into an existing social framework may explain why we find it necessary to assign characteristics such as gender even to fictional robots. Perhaps this also explains why Apple recently acquired Emotient, a company that focuses on reading and responding to human emotions.

Joaquin Phoenix falls in love with a computer in the movie "Her."

It’s also interesting to consider the potential of truly engaging, emotionally powerful computer interfaces, of the kind portrayed so well in the Spike Jonze movie Her. But it’s still very difficult to decode and mimic all of the subtleties of human communication. As Michael Picheny, a senior manager at the Watson Multimodal Lab for IBM Research, says in the NYT piece: “A good computer-machine interface is a piece of art, and should be treated as such.”

(Source: New York Times)
