The Translation Challenge

Software based on rules, examples, or statistics seeks to erase language barriers. It’s far from perfect, but sometimes close is good enough.

In the early, post-World War II days of computing, scientists dreamed of creating software so intelligent it could accurately translate one language into another. If computers could crack enemy codes, the thinking went, then why not foreign languages? Five decades later, researchers are still working on the problem. But what was a dream in the 1950s has become an overwhelming demand as business increasingly ignores traditional borders. In fact, by 2007, the already huge translation market, mostly manual today, is expected to reach $13 billion, as advertising, Web pages, and even internal company documents have to be tailored to different countries. “There’s so much more text than anybody could hope to translate by hand,” says Stephen Richardson, a senior researcher at Microsoft Research, “and it’s growing exponentially.”

Creating effective translation software requires solving many of the same problems that natural-language processing faces, and then some. Both systems must determine from context whether “light bulb,” for example, means a source of light or a less-than-heavy plant bulb. But while customer service software at, say, General Electric could be fairly sure of its interpretation, translation software might have to handle treatises on horticulture as well. Translation software must also contend with idioms whose figurative meaning has nothing to do with their literal meaning, and then find parallel idioms in a different language. It’s a problem that makes word-for-word translation impracticable.

Researchers are making progress today using three basic approaches drawn from natural-language processing. Knowledge-based machine translation, for example, relies on human programmers to write lists of rules that describe all possible relationships between verbs, nouns, prepositions, and so on for each language. To generate a sensible translation, the software then scours those rules to find matching words and relationships in the target language. IBM employs knowledge-based software in its WebSphere Translation Server, which the company and its global customers use to rapidly translate e-mail, Web sites, and even real-time chats.
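
To make the rule-based idea concrete, here is a minimal sketch in Python. It is not IBM's software or anything close to it: the tiny English-to-Spanish lexicon, the part-of-speech tags, and the single reordering rule are all assumptions invented for illustration. A real knowledge-based system would carry thousands of such rules.

```python
# Toy knowledge-based translator: a hand-written lexicon plus one grammar rule.
# Every entry and name below is a hypothetical example, not a real product's data.
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "fast": ("rápido", "ADJ"),
    "car": ("coche", "NOUN"),
}


def translate(sentence: str) -> str:
    """Translate word by word, then apply the adjective-noun reordering rule."""
    # Look up each word's translation and part of speech.
    # Unknown words raise KeyError; this sketch makes no attempt to handle them.
    tagged = [LEXICON[word] for word in sentence.lower().split()]

    output = []
    i = 0
    while i < len(tagged):
        is_adj_noun = (
            i + 1 < len(tagged)
            and tagged[i][1] == "ADJ"
            and tagged[i + 1][1] == "NOUN"
        )
        if is_adj_noun:
            # Rule: English "ADJ NOUN" becomes Spanish "NOUN ADJ".
            output.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    return " ".join(output)


print(translate("the red car"))
```

Without the reordering rule, the same lookup would produce a word-for-word rendering in English word order, which is the failure the hand-written rules exist to prevent.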

A second approach, example-based systems, relies chiefly on raw computing power. Software algorithms search through millions of words and phrases in documents that already have been accurately translated into multiple languages, compare the translations, and then create an enormous database of vocabulary and word relationships to resolve ambiguity for future translations.
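
A rough Python sketch of the example-based idea follows. It assumes a two-entry store of already translated sentences (the `EXAMPLES` dictionary and the `closest_example` function are hypothetical names) and uses the standard library's `difflib` to find the stored sentence most similar to a new one. Production systems work from millions of examples and match at the level of words and phrases rather than whole sentences.

```python
import difflib

# Hypothetical store of sentences that humans have already translated.
EXAMPLES = {
    "the light bulb is broken": "la bombilla está rota",
    "plant the bulb in spring": "planta el bulbo en primavera",
}


def closest_example(sentence: str, cutoff: float = 0.6):
    """Return (source, translation) for the most similar stored example, or None."""
    matches = difflib.get_close_matches(
        sentence.lower(), list(EXAMPLES), n=1, cutoff=cutoff
    )
    if not matches:
        return None  # Nothing in the store is similar enough to reuse.
    source = matches[0]
    return source, EXAMPLES[source]


# The new sentence differs slightly from a stored one, so its translation is reused.
print(closest_example("the light bulb was broken"))
```

Because the match is made against whole translated examples, the surrounding words help settle which sense of “bulb” is meant, which is how such a database can resolve ambiguity.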

Statistical techniques also depend on computing power to compare reams of previously translated text. However, this strategy selects the most likely translation using sophisticated mathematical models that the software continually updates based on how often its interpretations prove accurate. If it correctly translates “light bulb,” then the probability score assigned to that translation of the phrase goes up for future use. Microsoft used a combination of example-based and statistical techniques to automatically translate its entire 60-million-word product-support-services knowledge base from English to Spanish.
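
The scoring loop described above can be sketched in a few lines of Python. This is a simplified illustration, not Microsoft's method: the `TranslationModel` class, its method names, and the use of plain counts as probability scores are all assumptions. Real statistical systems estimate far richer models from aligned text.

```python
from collections import defaultdict


class TranslationModel:
    """Keeps a score for each candidate translation of a phrase (hypothetical design)."""

    def __init__(self):
        # counts[phrase][translation] -> accumulated evidence for that pairing
        self.counts = defaultdict(dict)

    def add_candidate(self, phrase: str, translation: str, prior: float = 1.0) -> None:
        """Register a candidate translation with some initial weight."""
        current = self.counts[phrase].get(translation, 0.0)
        self.counts[phrase][translation] = current + prior

    def probability(self, phrase: str, translation: str) -> float:
        """Share of the evidence for this phrase that supports this translation."""
        candidates = self.counts.get(phrase, {})
        total = sum(candidates.values())
        return candidates.get(translation, 0.0) / total if total else 0.0

    def best(self, phrase: str):
        """Return the highest-scoring translation, or None if the phrase is unknown."""
        candidates = self.counts.get(phrase, {})
        return max(candidates, key=candidates.get) if candidates else None

    def confirm(self, phrase: str, translation: str) -> None:
        """Record that a translation proved accurate, raising its score."""
        self.add_candidate(phrase, translation, prior=1.0)


model = TranslationModel()
model.add_candidate("light bulb", "bombilla")
model.add_candidate("light bulb", "bulbo ligero")
model.confirm("light bulb", "bombilla")  # a correct translation raises its score
print(model.best("light bulb"), model.probability("light bulb", "bombilla"))
```

Both candidates start with equal weight, so each confirmation shifts the balance toward the translation that has proved accurate.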

With accuracy rates hovering between 70 and 80 percent, no system is foolproof. But, says Michael McCord of IBM’s Language Analysis and Translation group in Yorktown Heights, NY, translation software significantly reduces the time it takes for humans to translate documents, and sometimes an approximate translation will do. Meanwhile, accuracy rates keep inching up. “More work, bigger machines, more people. We’ll get better,” he says.

See Computers that Speak Your Language
