Building customer relationships with conversational AI

Artificial intelligence has developed to the point where chatbots and virtual assistants can have more nuanced interactions with humans—and that opens a wealth of possibilities.

MIT Technology Review Insights

March 29, 2021

In association with Salesforce

We’ve all been there. “Please listen to our entire menu as our options have changed. Say or press one for product information...” Sometimes, these automated customer service experiences are effective and efficient—other times, not so much.

Many organizations are already using chatbots and virtual assistants to help better serve their customers. These intelligent, automated self-service agents can handle frequently asked questions, provide relevant knowledge articles and resources to address customer inquiries, and help customers fill out forms and do other routine procedures. In the case of more complex inquiries, these automated self-service agents can triage those requests to a live human agent.

During times of uncertainty and emergency, customer service operations powered by artificial intelligence (AI) can be invaluable to businesses, helping customer service or human resources call centers keep up with spikes in demand and reduce customer wait times and frustration. Gartner predicts that by 2022, 70% of customer interactions will involve emerging technologies such as machine learning applications, chatbots, and mobile messaging, up from 15% in 2018.

“In these types of conversational interactions, AI chatbots can extend the reach of an organization’s customer service and maintain a level of reciprocity with their customers,” says Greg Bennett, conversation design principal at Salesforce. “There’s also the opportunity for the business to express its brand, its voice, and its tone through words and language it uses to create a greater degree of intimacy.” Bennett is deeply involved in training AI systems that power conversational chatbots and ensuring they are inclusive and able to understand a broad range of dialects, accents, and other linguistic expressions.

Not only is the use of AI automation becoming more widespread, it is also proving to be a significant business driver. Gartner anticipates that in 2021, AI augmentation will generate $2.6 trillion in business value. It could also save as many as 6.2 billion hours of labor.

Conversational intelligence defined

According to research conducted by management consultancy Korn Ferry, conversational intelligence is a collaborative effort: two participants communicate reciprocally in ways that lead to a shared concept of reality. That closes the gap between the two speakers’ individual realities and helps businesses help customers.

With that in mind, Salesforce and other companies have taken the concept one step further by looking for ways to combine conversational intelligence with technology. Through these efforts, AI-powered conversational intelligence has vastly improved over time. It started with simple text recognition, in which it’s fairly easy to achieve a significant degree of accuracy. But text recognition can be somewhat two-dimensional, which is why research has progressed to include automated speech recognition. Automated speech recognition systems must account for different languages, accents, and acoustic inflections, which is much more difficult and nuanced. As AI algorithms have become more sophisticated and have had time to incorporate more linguistic variations, AI technology has improved its ability to accurately understand the deeper subtleties of human conversational interactions.

“Conversational intelligence is the constellation of features and technologies that enable humans and machines to take turns exchanging language and work toward accomplishing a discursive goal,” says Bennett.

These AI systems focused on linguistics use a number of different technologies to understand written and spoken interactions with humans. Some of these include the following:

Automated speech recognition, which is used to understand spoken language for voice systems;
Natural language processing, which helps computers understand, interpret, and analyze spoken and written language; and
Natural language understanding, which makes it possible for AI to understand intent.

Going well beyond simple text recognition, natural language understanding is where AI is truly bringing its strengths to bear. By facilitating deeper, more nuanced conversation, it increases the efficacy of human-AI interactions. When an AI-powered customer service system is better equipped to recognize and discern natural language with fewer errors, it can guide a customer through an entire interaction without having to engage a human service agent. This frees up the agents to focus on more complex cases.
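
To make the handoff between automated and human agents concrete, here is a minimal sketch in Python. It is an illustration only, not how Salesforce or any other vendor implements natural language understanding: the intent names, example phrases, word-overlap scoring, and 0.4 threshold are all assumptions chosen for the example. The idea it shows is simply that the system answers on its own when it recognizes intent with enough confidence and routes the customer to a person when it does not.

```python
import re

# Hypothetical intents and example phrases (assumptions for illustration).
# A production system would learn these from real conversation data.
INTENT_EXAMPLES = {
    "reset_password": ["i forgot my password", "reset my password", "cannot log in"],
    "order_status": ["where is my order", "track my package", "order status"],
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def overlap(tokens, example):
    """Jaccard similarity between the customer's words and an example phrase."""
    example_tokens = tokenize(example)
    union = tokens | example_tokens
    return len(tokens & example_tokens) / len(union) if union else 0.0


def route(utterance, threshold=0.4):
    """Return ("bot", intent) when confident, otherwise ("human_agent", None)."""
    tokens = tokenize(utterance)
    best_intent, best_score = None, 0.0
    for intent, examples in INTENT_EXAMPLES.items():
        for example in examples:
            score = overlap(tokens, example)
            if score > best_score:
                best_intent, best_score = intent, score
    if best_score >= threshold:
        return ("bot", best_intent)
    # Low confidence: hand the conversation to a live agent.
    return ("human_agent", None)


print(route("Where is my order?"))
print(route("My invoice shows a duplicate charge from last spring"))
```

Word overlap is far cruder than the language models used in practice, but the routing decision has the same shape: fewer recognition errors mean fewer conversations fall below the threshold and reach a human agent.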

And using these capabilities in customer service environments can help companies not only expedite and improve interactions with their customers but also improve the overall customer relationship. “If we can have a machine that helps facilitate that type of interaction between a company and a customer, then it helps to further build a relationship with that customer in a way that a help article would not,” says Bennett.

And the more an AI system engages with humans, the more effective its algorithms become. By interacting with humans, an AI system can gather the data required to improve natural language understanding to better understand intent, helping to facilitate more nuanced human-computer conversations. Human interaction also helps these AI systems improve recognition and predictive capabilities to deliver more personalized content. By learning the many ways people behave and interact, the system’s response becomes more accurate.

AI algorithms absorb, process, and analyze the data sets fed into the system, each using its own mathematical model. This learning is done in one of two basic modalities: supervised or unsupervised. In supervised learning, each example in the data set has an assigned target value or category. In unsupervised learning, the algorithm analyzes the data set on its own, with no labels to guide it.
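
The difference between the two modalities can be shown with a small sketch in Python. Everything in it is an assumption made for illustration: the single feature (message length in words), the "simple" and "complex" labels, and the choice of a nearest-centroid classifier and a two-cluster k-means as stand-ins for the far larger models used in practice. In the supervised case the data arrive with categories attached; in the unsupervised case the algorithm has to find the groups itself.

```python
from statistics import mean

# Hypothetical data: message length in words, an assumed stand-in feature.
LABELED = [(3, "simple"), (5, "simple"), (4, "simple"),
           (30, "complex"), (42, "complex"), (36, "complex")]
UNLABELED = [value for value, _ in LABELED]


def fit_supervised(labeled):
    """Supervised: use the given labels to compute one centroid per category."""
    by_label = {}
    for value, label in labeled:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Assign a new value to the category with the nearest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def fit_unsupervised(values, iterations=10):
    """Unsupervised: two-cluster k-means on unlabeled values."""
    centers = [min(values), max(values)]  # simple deterministic starting points
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)


centroids = fit_supervised(LABELED)
print(predict(centroids, 6))        # category learned from labels
print(fit_unsupervised(UNLABELED))  # group centers found without labels
```

On this toy data both approaches find the same two groups, but only the supervised model can name them, because only it was told what the categories were.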

As they receive and process more data, the algorithms evolve, adapt, and improve their analytical models. So the algorithms improve and refine themselves based on both the quality and quantity of data processed. “There are notions that AI can glean distinct intent, scope, and context by interacting with humans,” says Bennett. “These incremental improvements in predictive ability and depth of understanding increase the efficiency of customer engagement.”

Appreciating linguistic challenges

Although natural language processing has come a long way, automated speech recognition technology continues to face challenges in recognizing the full range of linguistic variations. “There are all these different English accents, all of them are robust and valid and should be celebrated,” says Bennett. Other linguistic variations that challenge AI include different slang or colloquial expressions to convey similar meanings and other paralinguistic features like tone, intonation, pacing, pausing, and pitch.

It is paramount to help AI manage the inherent bias present in the system and expand to recognize the full range of linguistic variations. These incremental improvements in the predictive ability of AI algorithms improve the customer experience by reducing the number of back-and-forth exchanges and the moments of frustration brought on by inaccurate recognition.

But these efforts and advancements present certain ethical conundrums. Consider, for example, how minorities are represented in training datasets—or more accurately how they are not represented. Most widely used datasets exclude more diverse expressions of dialect and social identity. Ensuring a diverse representation on the teams developing AI technologies is a critical step toward developing and evolving AI algorithms to recognize a broader array of linguistic expressions.

Now that AI is capable of allowing for a greater degree of variation, it should be able to account for broader contextual relevance and be more inclusive. Although conversation and language are the conduit, it is incumbent on humans working with AI systems to continue to consider accessibility across dialects, accents, and other stylistic variations.

“Under-represented minorities have very little representation of their dialect and the expression of their social identity through language in these systems. It’s mostly because of their lack of representation among the teams creating the technology,” says Bennett. Ensuring that companies developing and deploying AI systems bring more diverse teams into the mix can help resolve that inherent bias.

AI systems have the capacity to allow for a greater degree of variation. When the systems can accurately interpret those variations and generate a contextually relevant response, AI will have evolved to a greater degree than ever before. “That’s really where I think the evolution [of the field] has taken us,” Bennett says.

Of course, that’s not to say there aren’t other ethical and practical concerns surrounding the expanded use of AI. Privacy concerns, responsibility, transparency, and accurately and appropriately delegating decision processes are all still relevant. And then there’s the ethical use of voice recordings. It’s a growing field in which significant parameters still need to be defined.

Forging a deeper human-AI connection

Addressing the full range of linguistic variations and including more diverse groups and historically under-represented minorities in the process is truly building the future of the human-AI connection. This will also lead to more widespread use cases for business. In fact, the biggest competitive differentiator in the future of conversational technology will be the ability to provide robust conversational understanding regardless of language, accent, slang, dialect, or other aspects of social identity.

Bennett recalls a lesson from a grad school professor: “She said, ‘Having a conversation is like climbing a tree that climbs back.’ And that really characterizes the trajectory of where conversational AI technologies must go in order to meet the human needs and standards of conversation as a behavioral practice.” Conversation is not a solo act. It’s a two-way street. True conversation is the act—some might even say the art—of taking turns engaging in speaking and listening, exchanging ideas, exchanging feelings, and exchanging information.

“In linguistics, the paralinguistic features of speech like inflection, intonation, pacing, pausing, and pitch provide the pragmatic layer of meaning to a conversation,” says Bennett. “Instead of focusing on how the users can help AI systems, we should be asking how we can scale the system to meet the users where they are. Given what we know about linguistics, I don’t believe you can force any sort of language change,” he says. “Conversational AI technology is set up in a way that could succeed if we took that approach at the pragmatic layer—the paralinguistic side of things.”

“The capacity to comprehend, fully understand, and scale to that level of linguistic diversity is where AI is heading,” says Bennett. “Startups in the conversational AI space are indexing on that as a differentiating factor. And when you think about it, if you include more diverse groups and historically under-represented minorities in the process, that actually expands your total addressable market.”

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.
