Date: 2021-05-20

In December, Google ousted its ethical AI co-lead Timnit Gebru after she refused to retract a paper that made many of these points. A few months later, after wide-scale denunciation of what an open letter from Google employees called the company’s “unprecedented research censorship,” it fired Gebru’s coauthor and co-lead Margaret Mitchell as well.

It’s not just Google that is deploying this technology. The highest-profile language models so far have been OpenAI’s GPT-2 and GPT-3, which spew remarkably convincing passages of text and can even be repurposed to finish off music compositions and computer code. Microsoft now exclusively licenses GPT-3 to incorporate into yet-unannounced products. Facebook has developed its own LLMs for translation and content moderation. And startups are creating dozens of products and services based on the tech giants’ models. Soon enough, all of our digital interactions—when we email, search, or post on social media—will be filtered through LLMs.

Unfortunately, very little research is being done to understand how the flaws of this technology could affect people in real-world applications, or to figure out how to design better LLMs that mitigate these challenges. As Google underscored in its treatment of Gebru and Mitchell, the few companies rich enough to train and maintain LLMs have a heavy financial interest in declining to examine them carefully. In other words, LLMs are increasingly being integrated into the linguistic infrastructure of the internet atop shaky scientific foundations.

More than 500 researchers around the world are now racing to learn more about the capabilities and limitations of these models. Working together under the BigScience project led by Hugging Face, a startup that takes an “open science” approach to understanding natural-language processing (NLP), they seek to build an open-source LLM that will serve as a shared resource for the scientific community. The goal is to generate as much scholarship as possible within a single focused year. Their central question: How and when should LLMs be developed and deployed to reap their benefits while avoiding their harmful consequences?

“We can’t really stop this craziness around large language models, where everybody wants to train them,” says Thomas Wolf, the chief science officer at Hugging Face, who is co-leading the initiative. “But what we can do is try to nudge this in a direction that is in the end more beneficial.”

Stochastic parrots

In the same month that BigScience kicked off its activities, a startup named Cohere quietly came out of stealth. Started by former Google researchers, it promises to bring LLMs to any business that wants one—with a single line of code. It has developed a technique to train and host its own model with the idle scraps of computational resources in a data center, which holds down the costs of renting out the necessary cloud space for upkeep and deployment.

Among its early clients is the startup Ada Support, a platform for building no-code customer support chatbots, which itself has clients like Facebook and Zoom. And Cohere’s investor list includes some of the biggest names in the field: computer vision pioneer Fei-Fei Li, Turing Award winner Geoffrey Hinton, and Apple’s head of AI, Ian Goodfellow.

Cohere is one of several startups and initiatives now seeking to bring LLMs to various industries. There’s also Aleph Alpha, a startup based in Germany that seeks to build a German GPT-3; an unnamed venture started by several former OpenAI researchers; and the open-source initiative Eleuther, which recently launched GPT-Neo, a free (and somewhat less powerful) reproduction of GPT-3.

But it’s the gap between what LLMs are and what they aspire to be that has concerned a growing number of researchers. LLMs are effectively the world’s most powerful autocomplete technologies. By ingesting millions of sentences, paragraphs, and even samples of dialogue, they learn the statistical patterns that govern how each of these elements should be assembled in a sensible order. This means LLMs can enhance certain activities: for example, they are good for creating more interactive and conversationally fluid chatbots that follow a well-established script. But they do not actually understand what they’re reading or saying. Many of the most advanced capabilities of LLMs today are also available only in English.
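
To make the autocomplete comparison concrete, here is a deliberately tiny sketch of the idea. It is not how GPT-3 or any production LLM is built (those use large neural networks trained on vastly more text); it simply counts which word tends to follow which in a made-up four-sentence corpus and then extends a prompt with the most frequent continuation. The corpus and the function names `train_bigram_model` and `autocomplete` are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, how often each other word follows it."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def autocomplete(model, prompt, max_new_words=5):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        # .get() avoids creating empty entries for words never seen in training
        candidates = model.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in the corpus
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical toy corpus, standing in for the text a real model ingests.
corpus = [
    "how can i help you today",
    "how may i help you today",
    "how can i reset my password",
    "can you help with my order",
]

model = train_bigram_model(corpus)
print(autocomplete(model, "how can"))
```

The sketch produces fluent-looking text purely from counts of what followed what. Nothing in it represents what a password or an order is, which is the sense in which such systems assemble words in a sensible order without understanding them.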