The Online Language Barrier

Efforts to bring Internet connectivity to developing countries don’t address the fact that little content is available in local languages.
March 6, 2015

No matter how many Facebook drones and Google balloons take to the skies in the next few years, they can’t change the fact that the vast majority of Internet content is available in only a few languages. The dominance of those tongues may limit the appeal of the Internet to newcomers—or accelerate their adoption of new languages.

The World Bank estimates that 80 percent of online content is available in one of just 10 languages: English, Chinese, Spanish, Japanese, Arabic, Portuguese, German, French, Russian, and Korean.

Roughly three billion people around the world speak one of those as their first language. But over half of all online content is written in English, which is understood by just 21 percent of the world’s population, according to estimates by Web browser maker Mozilla and the mobile industry trade group GSMA.

Mozilla and GSMA estimate that Hindi, the first language of roughly 260 million people, constitutes less than 0.1 percent of all Web content. According to a recent estimate by the U.N., only 3 percent of online content is in Arabic, which is the primary language of around 240 million people.

Part of the challenge is that the developing world, which accounts for 94 percent of the world’s offline population, has almost twice the language diversity of the developed world. In India, for example, there are around 425 different languages.

Wikipedia highlights the English-centric nature of the Internet. There are more than four million Wikipedia articles in English, and no other language is represented by more than two million articles. Only 15 other languages are used in more than 500,000 articles, and 7,002 languages don’t appear at all.
