The Translation Challenge

Software based on rules, examples, or statistics seeks to erase language barriers. It’s far from perfect, but sometimes close is good enough.

In the early, post-World War II days of computing, scientists dreamed of creating software so intelligent it could accurately translate one language into another. If computers could crack enemy codes, the thinking went, then why not foreign languages? Five decades later, researchers are still working on the problem. But what was a dream in the 1950s has become an overwhelming demand as business increasingly ignores traditional borders. In fact, by 2007, the already huge translation market, mostly manual today, is expected to reach $13 billion, as advertising, Web pages, and even internal company documents have to be tailored to different countries. “There’s so much more text than anybody could hope to translate by hand,” says Stephen Richardson, a senior researcher at Microsoft Research, “and it’s growing exponentially.”

Creating effective translation software requires solving many of the same problems that natural-language processing faces, and then some. Both kinds of software must determine from context whether “light bulb,” for example, means a source of light or a less-than-heavy plant bulb. But while customer service software at, say, General Electric could be fairly sure of its interpretation, translation software might have to handle treatises on horticulture as well. Translation software must also contend with idioms whose figurative meaning has nothing to do with their literal meaning, and then find parallel idioms in a different language. It’s a problem that makes word-for-word translation impracticable.

Researchers are making progress today using three basic approaches drawn from natural-language processing. Knowledge-based machine translation, for example, relies on human programmers to write lists of rules that describe all possible relationships between verbs, nouns, prepositions, and so on for each language. To generate a sensible translation, the software then scours those rules to find matching words and relationships in the target language. IBM employs knowledge-based software in its WebSphere Translation Server, which the company and its global customers use to rapidly translate e-mail, Web sites, and even real-time chats.

A second approach, example-based systems, relies chiefly on raw computing power. Software algorithms search through millions of words and phrases in documents that already have been accurately translated into multiple languages, compare the translations, and then create an enormous database of vocabulary and word relationships to resolve ambiguity for future translations.
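
As a rough illustration of the example-based idea, the sketch below builds a small lookup table from phrase pairs that have already been translated, then reuses the most frequent match for new text. It is a toy under stated assumptions, not how any product mentioned here works: the English-to-Spanish pairs, the names `EXAMPLES`, `build_phrase_table`, and `translate`, and the longest-match rule are all invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: phrase pairs harvested from documents that
# were already translated by hand (English -> Spanish).
EXAMPLES = [
    ("light bulb", "bombilla"),
    ("light bulb", "bombilla"),
    ("light", "luz"),
    ("light", "ligero"),
    ("light", "luz"),
    ("the", "la"),
    ("is broken", "está rota"),
]


def build_phrase_table(examples):
    """Count how often each source phrase was rendered as each target phrase."""
    table = defaultdict(Counter)
    for source, target in examples:
        table[source][target] += 1
    return table


def translate(sentence, table, max_len=3):
    """Translate by reusing the longest previously seen phrase at each position."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest phrase first so "light bulb" wins over "light".
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in table:
                # Pick the rendering seen most often in the examples.
                output.append(table[phrase].most_common(1)[0][0])
                i += n
                break
        else:
            # No example covers this word: pass it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)


table = build_phrase_table(EXAMPLES)
print(translate("The light bulb is broken", table))
```

Matching the longest known phrase first is what lets the stored examples settle the ambiguity: “light bulb” is looked up as a unit before the software ever considers “light” on its own.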

Statistical techniques also depend on computing power to compare reams of previously translated text. However, this strategy selects the most likely translation using sophisticated mathematical models that the software continually upgrades based on how often its interpretations prove accurate. If it correctly translates “light bulb,” then the probability score assigned to that phrase for future translations goes up. Microsoft used a combination of example-based and statistical techniques to automatically translate its entire 60-million-word product-support-services knowledge base from English to Spanish.
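
The feedback loop described above can be sketched in a few lines. This is a deliberately simplified illustration, not Microsoft's or anyone else's actual model: the class `FeedbackTranslator`, its methods, the starting weights, and the Spanish candidates are all assumptions made for the example. Each candidate translation carries a weight, the probability of a candidate is its share of the total weight, and a confirmed translation has its weight raised.

```python
from collections import defaultdict


class FeedbackTranslator:
    """Toy statistical chooser: picks the candidate with the highest score."""

    def __init__(self):
        # scores[source][target] is an unnormalized weight for that candidate.
        self.scores = defaultdict(dict)

    def add_candidate(self, source, target, weight=1.0):
        self.scores[source][target] = weight

    def probability(self, source, target):
        """Share of the total weight for `source` held by `target`."""
        total = sum(self.scores[source].values())
        return self.scores[source][target] / total

    def translate(self, source):
        """Return the most probable candidate for a known source phrase."""
        candidates = self.scores[source]
        return max(candidates, key=candidates.get)

    def feedback(self, source, target, correct, step=1.0):
        """Raise the weight of a translation that proved accurate."""
        if correct:
            self.scores[source][target] += step


model = FeedbackTranslator()
model.add_candidate("light bulb", "bombilla")      # lamp
model.add_candidate("light bulb", "bulbo ligero")  # lightweight plant bulb

before = model.probability("light bulb", "bombilla")
model.feedback("light bulb", "bombilla", correct=True)
after = model.probability("light bulb", "bombilla")

print(before, after, model.translate("light bulb"))
```

Real statistical systems estimate such probabilities from large volumes of translated text rather than from a single confirmation, but the direction of the update is the same: accurate choices become more likely next time.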

With accuracy rates hovering between 70 and 80 percent, no system is foolproof. But, says Michael McCord of IBM’s Language Analysis and Translation group in Yorktown Heights, NY, translation software significantly reduces the time it takes for humans to translate documents, and sometimes an approximate translation will do. Meanwhile, accuracy rates keep inching up. “More work, bigger machines, more people. We’ll get better,” he says.

See Computers that Speak Your Language
