In May, Google made quite the splash when it unveiled Duplex, its eerily humanlike voice assistant capable of making restaurant reservations and salon appointments. It seemed to mark a new milestone in speech generation and natural-language understanding, and it pulled back the curtain on what the future of human-AI interaction might look like.
But while Google slowly rolls out the feature in a limited public launch, Alibaba’s own voice assistant has already been clocking overtime. On December 2 at the 2018 Neural Information Processing Systems conference, one of the largest annual gatherings for AI research, Alibaba demoed the AI customer service agent for its logistics company Cainiao. Jin Rong, the dean of Alibaba’s Machine Intelligence and Technology Lab, said the agent is already servicing millions of customer requests a day.
The pre-recorded demo call involved the agent asking a customer where he wanted his package delivered. In the back-and-forth exchange, the agent successfully navigated several conversational elements that demonstrated the breadth of its natural-language capabilities.
Take this exchange at the beginning of the call, translated from Mandarin:
Agent: Hello, I am Cainiao’s voice assistant. I am—
A: Yes, hi, you have a package scheduled for morning delivery to 588 Culture West Road. Is it convenient for you to receive it?
C: Who are you?
A: I am Cainiao’s voice assistant. I’d like to confirm your morning delivery to 588 Culture West Road. Does that work for you?
C: I’m not home in the morning.
A: Then do you have another address that you’d like to use?
Within 30 seconds, the agent has smoothly handled three common, and tricky, conversational ingredients: interruption, nonlinear conversation, and implicit intent. Interruption is self-explanatory: the agent can respond to the customer’s interruption and continue relaying relevant information without starting over or skipping a beat.
The nonlinear conversation occurs when the customer asks, “Who are you?” This requires the agent to register that the customer is not answering the preceding question but rather starting a new line of inquiry. In response, the agent reintroduces itself before returning to the original question.
The implicit intent occurs when the customer responds, “I’m not home in the morning.” He never explicitly says what he actually means—that home delivery won’t work—but the agent is able to read between the lines and follow up sensibly.
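To make two of these behaviors concrete, here is a minimal, hypothetical sketch in Python. It is not Alibaba's system, whose design was not disclosed. It assumes a toy keyword matcher (`classify`) in place of a trained language model, and every name in it (`DeliveryAgent`, `pending_question`, the intent labels) is invented for illustration. It shows how an agent might answer a digression and then return to its open question (nonlinear conversation), and how a statement like "I'm not home" might be mapped to a rejection (implicit intent). Interruption handling happens at the audio level and is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical intent labels, invented for this illustration.
CONFIRM, REJECT, ASK_IDENTITY, UNKNOWN = "confirm", "reject", "ask_identity", "unknown"


def classify(utterance: str) -> str:
    """Toy keyword matcher standing in for a trained intent model."""
    text = utterance.lower().strip()
    if "who are you" in text:
        return ASK_IDENTITY
    # Implicit intent: the speaker never says "no", but being away
    # from home implies the proposed delivery will not work.
    if "not home" in text or text in {"no", "no."}:
        return REJECT
    if text.startswith("yes"):
        return CONFIRM
    return UNKNOWN


@dataclass
class DeliveryAgent:
    address: str
    # The question the agent is still waiting to have answered.
    pending_question: Optional[str] = None

    def open(self) -> str:
        self.pending_question = (
            f"I'd like to confirm your morning delivery to {self.address}. "
            "Does that work for you?"
        )
        return "Hello, I am the voice assistant. " + self.pending_question

    def respond(self, utterance: str) -> str:
        intent = classify(utterance)
        if intent == ASK_IDENTITY:
            # Nonlinear turn: answer the side question, then repeat
            # the pending question instead of dropping it.
            return f"I am the voice assistant. {self.pending_question}"
        if intent == REJECT:
            self.pending_question = "Do you have another address that you'd like to use?"
            return "Then do you have another address that you'd like to use?"
        if intent == CONFIRM:
            self.pending_question = None
            return "Great, your delivery is confirmed."
        return f"Sorry, I didn't catch that. {self.pending_question}"


agent = DeliveryAgent("588 Culture West Road")
print(agent.open())
print(agent.respond("Who are you?"))
print(agent.respond("I'm not home in the morning."))
```

The key design choice in this sketch is keeping the open question in state, so a side question does not erase what the agent was trying to find out. A production system would replace the keyword rules with learned models.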
These elements may be boringly commonplace in human conversations, but machines often struggle to handle them. That Alibaba’s voice assistant can do so suggests it’s more sophisticated than Google Duplex, judging from similar sample calls demoed by Google. It’s worth noting, however, that Alibaba’s demo call is designed for onstage presentation; the experience could differ in reality.
Currently, the agent is used only to coordinate package deliveries, but Jin said it could be expanded to handle other topics. He wouldn’t fully reveal how the assistant was trained. But he alluded to using the massive number of customer recordings at the company’s disposal, in addition to other resources. On a typical day the company averages 50,000 customer service calls, according to the presentation slides—a number that quintuples for Singles’ Day (November 11), its highest revenue-generating holiday of the year.
Alibaba is also developing digital assistants for other aspects of its business, including a food-ordering agent that can take your order in noisy restaurants and stores; a humanlike virtual avatar that can field questions about Alibaba products; and a price-haggling chatbot that is already used by 20% of sellers on Alibaba’s resale platform Xianyu.
At its core, each of these assistants is powered by a speech-recognition and natural-language-processing engine called AliMe, developed by the company’s Machine Intelligence and Technology Lab. The engine is then packaged and adapted to different parts of the business.
Alibaba’s biggest advantage in this field is the overwhelming wealth of data it has to train its AI. The assistants learn and improve faster because of the amount of practice they get in handling all kinds of situations. A huge business incentive to deploy these technologies quickly also helps. In addition to handling a high volume of customer support calls, Alibaba delivers one billion packages per day. Offloading certain tasks to AI helps alleviate the burden on humans and keep the business running smoothly.