Artificial intelligence

Tiny AI models could supercharge autocorrect and voice assistants on your phone

October 4, 2019
Illustration of a cell phone with a talkative voice assistant. Ms. Tech

Researchers have successfully shrunk a giant language model for use in commercial applications.

Who’s counting? In the past year, natural-language models have become dramatically better at the expense of getting dramatically bigger. In October of last year, for example, Google released a model called BERT that surpassed a long-held reading-comprehension benchmark in the field. The larger version of the model had 340 million parameters, and training it just once consumed enough electricity to power a US household for 50 days.

Four months later, OpenAI quickly topped it with its model GPT-2. The model demonstrated an impressive knack for constructing convincing prose; it also used 1.5 billion parameters. Now, MegatronLM, the latest and largest model from Nvidia, has 8.3 billion parameters. (Yes, things are getting out of hand.)

The big, the bad, the ugly: AI researchers have grown increasingly worried about the consequences of this trend. In June, a group at the University of Massachusetts, Amherst, showed the climate toll of developing and training models at such a large scale. Training BERT, they calculated, emitted nearly as much carbon as a round-trip flight between New York and San Francisco; GPT-2 and MegatronLM, by extrapolation, would likely emit a whole lot more.

The trend could also accelerate the concentration of AI research in the hands of a few tech giants. Labs in academia or in countries with fewer resources simply don’t have the means to use or develop such computationally expensive models.

Honey, I shrunk the AI: In response, many researchers are focused on shrinking existing models without losing their capabilities. Now two new papers, released within a day of one another, have successfully done that to the smaller version of BERT, which has roughly 110 million parameters.

The first paper, from researchers at Huawei, produces a model called TinyBERT that is less than a seventh the size of the original and nearly 10 times faster, yet performs nearly as well on language understanding. The second, from researchers at Google, produces a model more than 60 times smaller, though its language understanding is slightly worse than TinyBERT’s.

How they did it: Both papers use variations of a common compression technique known as knowledge distillation. It involves using the large AI model that you want to shrink (the “teacher”) to train a much smaller model (the “student”) in its image. To do so, you feed the same inputs into both and then tweak the student until its output distribution matches the teacher’s; because the student mimics the teacher’s full probabilities rather than just its final answers, each example carries more information.
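For a sense of what that looks like in code, here is a minimal distillation sketch in PyTorch. The tiny teacher and student networks, the temperature, and the single training step are illustrative assumptions, not the setup used in either paper; the point is only the core loop of feeding the same batch to both models and nudging the student’s softened outputs toward the teacher’s.

```python
# Minimal knowledge-distillation sketch (PyTorch).
# The architectures and hyperparameters below are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

temperature = 2.0  # softens both output distributions

# Stand-ins for a large "teacher" and a much smaller "student".
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

def distillation_step(x):
    # Feed the same inputs into both models; the teacher is frozen.
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)

    # Tweak the student so its softened output distribution
    # matches the teacher's, measured with KL divergence.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# One step on a random batch, just to show the loop runs.
print(distillation_step(torch.randn(32, 128)))
```

In practice the distillation loss is usually mixed with an ordinary supervised loss on labeled data, and the BERT-compression papers also match intermediate layers, but the teacher-to-student idea is the same.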

Outside of the lab: In addition to improving access to state-of-the-art AI, tiny models will help bring the latest AI advances to consumer devices. They avoid the need to send consumer data to the cloud, which improves both speed and privacy. For natural-language models specifically, more powerful text prediction and language generation could improve countless applications, from autocomplete on your phone to voice assistants like Alexa and Google Assistant.

