MIT Technology Review

 


For tens of millions of people around the world, from West Africa to Southeast Asia to the Middle East, the Internet’s not such a friendly place. That’s because many of the world’s writing systems still aren’t encoded in software, which means millions of people can’t write e-mail, build Web sites, or search databases in their native scripts. A group of linguists at the University of California, Berkeley, is trying to change that by making sure that nearly 100 additional scripts have a place in a crucial international standard that lets computers render, process, and send text data.

The university’s initiative “is an effort to rectify an oft-overlooked aspect of the digital divide: many scripts used by languages of under five million speakers in the world today are not represented in the international standard,” says Deborah Anderson, a linguist at Berkeley who leads the effort. That standard is Unicode, which assigns a unique ID number to every character, symbol, and punctuation mark in a written language. The ID numbers mean that characters won’t get misinterpreted as data move between software programs or across the Internet. When they are misinterpreted, the problem sometimes shows up as a string of question marks on your screen, and it can cripple the ability of whole populations to communicate via the Internet. For example, Unicode is enabling radical economic transformations in Vietnam. Before this year, computer and software manufacturers had come up with 43 different ways to encode Vietnamese text, which meant computers couldn’t reliably swap data. Then, early this year, the Vietnamese government adopted Unicode as its national standard.
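To make the idea of a unique ID number concrete, here is a minimal Python sketch. It is an illustration rather than anything from the Berkeley project: the helper names (`code_points`, `to_legacy`, `misread`) and the sample word are assumptions chosen for the example. It lists the Unicode ID of each character in a Vietnamese word, then shows two ways text can go wrong: a legacy encoding that lacks a character substitutes a question mark, and bytes written in one encoding but read in another come out garbled.

```python
import unicodedata


def code_points(text):
    """Return the Unicode ID (code point) of each character, e.g. 'U+0056'."""
    return [f"U+{ord(ch):04X}" for ch in text]


def to_legacy(text, encoding="ascii"):
    """Force text into an encoding that may lack some of its characters.

    Characters the encoding cannot represent are replaced with '?'.
    """
    return text.encode(encoding, errors="replace").decode(encoding)


def misread(text, written_as="utf-8", read_as="latin-1"):
    """Write text in one encoding and read the bytes back in another."""
    return text.encode(written_as).decode(read_as, errors="replace")


# Hypothetical sample: "Việt", with the accented letter given by its code point.
word = "Vi\u1ec7t"

print(code_points(word))              # one ID per character
print(unicodedata.name(word[2]))      # the official name of the accented letter
print(to_legacy(word))                # the accented letter becomes '?'
print(misread(word) == word)          # False: mismatched encodings garble the text
print(misread(word, "utf-8", "utf-8") == word)  # True: same encoding round-trips
```

When sender and receiver agree on the same Unicode encoding, the text survives the round trip unchanged, which is the interoperability Vietnam lacked with 43 competing encodings.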

The problem is that the more obscure writing systems are not yet encoded in the Unicode standard. Adding another 100 scripts is a big task; only 52 are encoded today. To do the job, Berkeley is recruiting and funding linguists, as well as users of scripts like N’Ko (used in West Africa), Balinese (used in Indonesia), and Tifinagh (used in parts of Northern Africa), to determine how many characters each script contains, design fonts, and guide proposals through a bureaucratic maze of government agencies and computer standards bodies. The benefit will be visible to Internet users like Mamady Doumbouya, a Philadelphia publisher who would be able to offer an online version of his newspaper in N’Ko for the first time. “Without Unicode, it takes so much to set up your computer to read a newspaper in N’Ko,” Doumbouya says.

Such changes won’t happen overnight. Anderson estimates that the project, launched last year, will take 10 years to complete. Until recently, computer companies sustained the encoding effort, but their interest is dwindling because users of unencoded alphabets represent too small a market. The Berkeley project is part of a larger effort to make the Internet more globally available; already the World Wide Web Consortium has made it possible to register domain names in these new scripts, meaning, among other things, that the URLs of Web sites can reflect the writing systems of the people who own them.

U.S. national security experts are interested, too. Everette Jordan, head of the National Virtual Translation Center, a newly formed U.S. government office that provides foreign-language resources for the intelligence community, points out that “technologically, we’re deaf, dumb, and blind if we can’t read this stuff.” Soon, though, U.S. security agencies and African newspaper publishers alike could rally to a new standard.


Tagged: Web
