Companies that do business online are missing out on billions in annual sales thanks to a bug that is keeping their systems incompatible with Internet domain names made of non-Latin characters. Fixing it could also bring another 17 million people who speak Russian, Chinese, Arabic, Vietnamese, and Indian languages online.
Those are the conclusions of a new study by an industry-led group sponsored by the Internet Corporation for Assigned Names and Numbers (ICANN), the organization responsible for maintaining the list of valid Internet domain names. The objective of the so-called Universal Acceptance Steering Group, which includes representatives from a number of Internet companies including Microsoft and GoDaddy, is to encourage software developers and service providers to update how their systems validate the string of characters to the right of the dot in a domain name or e-mail address, also called the top-level domain.
The bug wasn’t an obvious problem until 2011, when ICANN decided to dramatically expand the range of what can appear to the right of the dot (see “ICANN’s Boondoggle”). Between 2012 and 2016, the number of top-level domains ballooned from 12 to over 1,200. That includes 100 “internationalized” domains that feature a non-Latin script or Latin-alphabet characters with diacritics, like an umlaut (¨), or ligatures, like the German Eszett (ß). Some 2.6 million internationalized domain names have been registered under the new top-level domains, largely concentrated in the Russian and Chinese languages, according to the new study.
Many Web applications or e-mail clients recognize top-level domains as valid only if they are composed of characters that can be encoded using American Standard Code for Information Interchange, or ASCII. The problem is most pronounced with e-mail addresses, which are required credentials for accessing online bank accounts and social media pages in addition to sending messages. In 2016, the group tested e-mail addresses with non-Latin characters to the right of the dot and found acceptance rates of less than 20 percent.
The fix is relatively straightforward, says Ram Mohan, the steering group’s chair: change the fundamental rules that validate domains so that they accept Unicode, a different standard for encoding text that works for many more languages. The new research suggests that the potential economic benefits of making the fix outweigh the costs. Too many businesses, including e-commerce firms, e-mail services, and banks, simply aren’t yet aware that their systems don’t accept these new domains, says Mohan.
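To make the difference concrete, here is a minimal Python sketch of the two kinds of validation rule. It is an illustration under stated assumptions, not code from the study or from any particular product: the pattern `LEGACY_TLD` and the helper names are hypothetical, and the "2 to 6 ASCII letters" rule is only one plausible example of an ASCII-only check. The Unicode-aware version is deliberately permissive and does not handle the ASCII-encoded "xn--" form of a domain.

```python
import re
import unicodedata

# Hypothetical legacy rule: a top-level domain is 2 to 6 ASCII letters.
LEGACY_TLD = re.compile(r"[A-Za-z]{2,6}")


def tld_of(name: str) -> str:
    """Return the label to the right of the last dot."""
    return name.rsplit(".", 1)[-1]


def legacy_accepts(name: str) -> bool:
    """ASCII-only check of the kind the article describes."""
    return LEGACY_TLD.fullmatch(tld_of(name)) is not None


def unicode_accepts(name: str) -> bool:
    """Accept a TLD made of letters and combining marks in any script."""
    tld = tld_of(name)
    return len(tld) >= 2 and all(
        unicodedata.category(ch)[0] in ("L", "M") for ch in tld
    )


if __name__ == "__main__":
    for name in ["example.com", "пример.рф", "例子.中国", "shop.photography"]:
        print(name, legacy_accepts(name), unicode_accepts(name))
```

Under these assumptions the legacy rule rejects the Cyrillic and Chinese examples, and also a long Latin-script domain such as ".photography", while the Unicode-aware rule accepts all four. A production system would more likely check the label against the published list of delegated top-level domains than rely on a character-class test.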
Things are improving, though. In 2014, Google updated Gmail to accept and display internationalized domain names without having to rely on an inconvenient workaround that translated the characters into ASCII. Microsoft is in the process of updating its e-mail systems, which include Outlook clients and its cloud-based service, to accept internationalized domain names and e-mail addresses.
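The article does not name the workaround, but the standard ASCII-compatible form of an internationalized domain name is Punycode, in which each non-ASCII label is rewritten with an "xn--" prefix. The sketch below uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules, to show the round trip; the function names `to_ascii` and `to_unicode` are illustrative choices, not part of any e-mail product's API.

```python
def to_ascii(domain: str) -> str:
    """Convert a domain to its ASCII-compatible (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible domain back to its Unicode form."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("рф"))          # the Russian top-level domain
    print(to_ascii("中国"))         # the Chinese top-level domain
    print(to_unicode("xn--p1ai"))  # back to the Cyrillic form
```

Software that shows only the "xn--" form works technically, but the result is unreadable to the person who registered the name, which is why native display counts as an improvement.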
It’s not just about the bottom line, says Mark Svancarek, a program manager for customer and partner experience at Microsoft, and a vice chair of the Universal Acceptance Steering Group. To let millions of people be held back from the Internet because “the character set is gibberish to them” is antithetical to his company’s mission, he says.
Acceptance of non-ASCII domains is likely to spur Internet adoption, since a large portion of the next billion people projected to connect to the Internet predominantly speak and write only in their local languages, says Mohan. Providing accessibility to these people will depend in many ways on the basic assumptions governing the core functions of the Internet, he says. “The problem here is that in some ways this is lazy programming, and because it’s lazy programming, it’s easy to replace it with better programming.”