Nvidia just made it easier to build smarter chatbots and slicker fake news

Chip maker Nvidia is betting that AI’s language skills will advance rapidly—it’s releasing a powerful tool for putting together chatty programs.

Will Knight

August 13, 2019

An image of a voice assistant device next to sound waves. Credit: Ms Tech; original image: HARMON KARDON

Artificial intelligence has made impressive strides in the last decade, but machines are still lousy at comprehending language. Just try engaging Alexa in a bit of witty banter.

Nvidia, the company that makes the computer chips that power many AI algorithms, thinks this is about to change and is looking to capitalize on an anticipated explosion in language AI.

Software the chip maker is releasing makes it easier to build AI programs on its hardware that are capable of using language more gracefully. The new code could accelerate the development of new language algorithms, and make chatbots and voice assistants snappier and smarter.

Nvidia already makes the most popular chips for training deep-learning AI models, which are proficient at tasks like image classification. Traditionally, though, it's been a lot harder to apply statistical machine-learning methods like deep learning to the written or spoken word, because language is so ambiguous and complex.

But there’s been some significant headway lately. Two new deep-learning approaches to language from Google, known as Transformer and BERT, have proved especially adept at translating between languages, answering questions about a piece of text, and even generating realistic-looking text. This has sparked an uptick in academic and industry interest in advancing language understanding using machine learning.

“The combination of Transformer and BERT has been massively impactful,” says Alexander Rush, a professor at Harvard University who specializes in the subfield of AI known as natural-language processing (NLP). “It’s basically state-of-the-art in every benchmark, and allows an undergrad to produce world-class models in five lines of code.”

Nvidia has been adept at chasing the latest trends in AI research. If its latest hunch proves correct, then voice assistants might go from merely responding to barked commands to stringing more words together coherently. Chatbots, meanwhile, may become less dimwitted, while the autocomplete feature found in many programs and apps might start suggesting whole paragraphs instead of just the next few words.

“We’ve got a lot of demand for language modeling,” says Bryan Catanzaro, VP for applied deep learning at Nvidia. “And if you look at the pace of language progress, it seems like an obvious place for us to make investments.”

Nvidia developed its software by optimizing numerous parts of the process used to train language models on its GPUs. This sped up the training of AI models (from several days to less than an hour), cut the response time of trained language models (from 40 milliseconds to just over 2 milliseconds), and allowed much larger language models to be trained (Nvidia’s language model, called Megatron, is many times larger than anything previously made, with 8.3 billion parameters).
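To put the response-time figures in perspective, here is a rough back-of-the-envelope sketch, not anything Nvidia published. It assumes the "just over 2 milliseconds" figure can be rounded to 2.0, and that a single stream handles requests one after another. The function names are made up for illustration.

```python
def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times shorter the new latency is than the old one."""
    if before_ms <= 0 or after_ms <= 0:
        raise ValueError("latencies must be positive")
    return before_ms / after_ms


def sequential_requests_per_second(latency_ms: float) -> float:
    """Requests per second for one stream that answers requests back to back."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


BEFORE_MS = 40.0  # figure quoted in the article
AFTER_MS = 2.0    # assumption: the article says "just over 2 milliseconds"

if __name__ == "__main__":
    print(f"speedup: about {speedup(BEFORE_MS, AFTER_MS):.0f}x")
    print(f"before: {sequential_requests_per_second(BEFORE_MS):.0f} requests/s")
    print(f"after:  {sequential_requests_per_second(AFTER_MS):.0f} requests/s")
```

Because the real figure is slightly above 2 milliseconds, the true speedup is a little under 20-fold. Either way, the gap is the difference between a reply that lags noticeably and one that feels instant.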

Autocomplete no evil

Advances in language may have a darker side, though. Smarter algorithms could be used to mass-produce more convincing, tailored fake reviews, social-media posts, and news stories. Other research groups have shown how powerful language models can gin up realistic-looking text after ingesting large swaths of writing from the internet.

Nvidia has a simple plan to prevent potential misuse: it won’t release the largest language model it has developed, and plans to rely on researchers to use its tools with care. “We are releasing code that shows how to use GPUs to train these large models,” Catanzaro says. “We believe the community will use this code responsibly, but keep in mind that training models of this size requires serious computing power, which puts it out of reach for most people.”

Even if progress continues apace, it’s likely to be a long while before machines can really converse with us. Language is deceptively difficult for machines to make sense of, in part because of its compositional complexity: words can be rearranged to unlock infinite meaning. Grasping the meaning of a phrase often also requires some sort of common-sense understanding of the world—something computers don’t have.

“We’re seeing a renaissance in NLP capabilities,” says Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence (Ai2), a nonprofit in Seattle dedicated to cutting-edge AI research. This will translate to better chatbots and voice assistants, he says, although they will suffer from a lack of common sense. “A voice assistant that is as helpful as a skillful hotel concierge is still beyond the horizon,” Etzioni says.

Ai2 recently launched a tool, called Grover, that uses NLP advances to catch text that seems to have been churned out by AI. Etzioni points out that bots already deceive people on Facebook and Twitter. “Automatically generated fake text is already here,” he says, “and is likely to increase exponentially.”
