The Translation Challenge
In the early, post-World War II days of computing, scientists dreamed of creating software so intelligent it could accurately translate one language into another. If computers could crack enemy codes, the thinking went, then why not foreign languages? Five decades later, researchers are still working on the problem. But what was a dream in the 1950s has become an overwhelming demand as business increasingly ignores traditional borders. In fact, by 2007, the already huge translation market, mostly manual today, is expected to reach $13 billion, as advertising, Web pages, and even internal company documents have to be tailored to different countries. “There’s so much more text than anybody could hope to translate by hand,” says Stephen Richardson, a senior researcher at Microsoft Research, “and it’s growing exponentially.”
Creating effective translation software requires solving many of the same problems that natural-language processing faces, and then some. Both kinds of software must determine from context whether “light bulb,” for example, means a source of light or a less-than-heavy plant bulb. But while customer service software at, say, General Electric could be fairly sure of its interpretation, translation software might have to handle treatises on horticulture as well. Translation software must also contend with idioms whose figurative meaning has nothing to do with their literal meaning, and then find parallel idioms in a different language. It’s a problem that makes word-for-word translation impracticable.
Researchers are making progress today using three basic approaches drawn from natural-language processing. The first, knowledge-based machine translation, relies on human programmers to write lists of rules that describe all possible relationships between verbs, nouns, prepositions, and so on for each language. To generate a sensible translation, the software then scours those rules to find matching words and relationships in the target language. IBM employs knowledge-based software in its WebSphere Translation Server, which the company and its global customers use to rapidly translate e-mail, Web sites, and even real-time chats.
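To make the knowledge-based idea concrete, here is a minimal sketch in Python. It assumes a tiny hand-written English-to-Spanish lexicon and a single reordering rule; the names `LEXICON` and `translate_with_rules` and every entry are invented for illustration and say nothing about how IBM's software is actually built.

```python
# Toy knowledge-based translation: a hand-written lexicon plus one grammar rule.
# Every entry and the rule itself are illustrative assumptions.

# Each English word maps to (part of speech, Spanish word).
LEXICON = {
    "the": ("DET", "la"),
    "red": ("ADJ", "roja"),
    "white": ("ADJ", "blanca"),
    "house": ("NOUN", "casa"),
}


def translate_with_rules(sentence):
    """Translate word by word, applying one hand-written reordering rule."""
    tagged = [LEXICON[word] for word in sentence.lower().split()]
    output = []
    i = 0
    while i < len(tagged):
        is_adj_noun = (
            i + 1 < len(tagged)
            and tagged[i][0] == "ADJ"
            and tagged[i + 1][0] == "NOUN"
        )
        if is_adj_noun:
            # Rule: English "ADJ NOUN" becomes Spanish "NOUN ADJ".
            output.extend([tagged[i + 1][1], tagged[i][1]])
            i += 2
        else:
            output.append(tagged[i][1])
            i += 1
    return " ".join(output)


print(translate_with_rules("the red house"))  # la casa roja
```

A real system would need thousands of such rules per language pair, which is why this approach depends so heavily on human programmers.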
A second approach, example-based systems, relies chiefly on raw computing power. Software algorithms search through millions of words and phrases in documents that already have been accurately translated into multiple languages, compare the translations, and then create an enormous database of vocabulary and word relationships to resolve ambiguity for future translations.
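The example-based idea can be sketched as a lookup against previously translated phrases, preferring the longest match so that context resolves ambiguity. The phrase pairs and the function names below are hypothetical, and a production system would mine millions of pairs from real documents rather than list five by hand.

```python
# Toy example-based translation: reuse phrases from earlier human translations.
# The phrase pairs below are made-up stand-ins for a large translated corpus.

PARALLEL_EXAMPLES = [
    ("light bulb", "bombilla"),
    ("change the", "cambiar la"),
    ("light", "luz"),
    ("the", "la"),
    ("change", "cambiar"),
]


def build_phrase_table(examples):
    """Index known source phrases by their stored translation."""
    return {source: target for source, target in examples}


def translate_by_example(sentence, table):
    """Greedily cover the sentence with the longest known phrases."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase starting at position i first.
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            if phrase in table:
                output.append(table[phrase])
                i = j
                break
        else:
            # No stored example covers this word, so pass it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)


table = build_phrase_table(PARALLEL_EXAMPLES)
print(translate_by_example("change the light bulb", table))  # cambiar la bombilla
print(translate_by_example("the light", table))              # la luz
```

Because "light bulb" is matched as a unit before "light" is considered on its own, the stored example settles which sense of "light" applies.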
A third approach, statistical translation, also depends on computing power to compare reams of previously translated text. However, this strategy selects the most likely translation using sophisticated mathematical models that the software continually updates based on how often its interpretations prove accurate. If it correctly translates “light bulb,” then the probability score assigned to that phrase for future translations goes up. Microsoft used a combination of example-based and statistical techniques to automatically translate its entire 60-million-word product-support-services knowledge base from English to Spanish.
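The scoring loop described here can be illustrated with simple counts. This sketch assumes that probability is just a candidate's share of confirmed translations for a phrase; the class `TranslationScores` and the sample counts are invented for the example, and real statistical models are far more elaborate.

```python
# Toy statistical translation: choose the most probable candidate and raise
# its score whenever a translation is confirmed as correct.
# The class name and the sample counts are illustrative assumptions.
from collections import defaultdict


class TranslationScores:
    def __init__(self):
        # counts[source][target] = number of confirmed uses of that pairing
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, source, target, times=1):
        """Record that `target` was a correct translation of `source`."""
        self.counts[source][target] += times

    def probability(self, source, target):
        """Share of confirmed translations of `source` that used `target`."""
        candidates = self.counts.get(source, {})
        total = sum(candidates.values())
        return candidates.get(target, 0) / total if total else 0.0

    def best(self, source):
        """Return the most probable translation, or None if none is known."""
        candidates = self.counts.get(source, {})
        if not candidates:
            return None
        return max(candidates, key=candidates.get)


scores = TranslationScores()
scores.observe("light bulb", "bombilla", times=3)
scores.observe("light bulb", "bulbo ligero", times=1)
print(scores.best("light bulb"))                     # bombilla
print(scores.probability("light bulb", "bombilla"))  # 0.75

# A newly confirmed translation pushes the probability score up.
scores.observe("light bulb", "bombilla")
print(scores.probability("light bulb", "bombilla"))  # 0.8
```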
With accuracy rates hovering between 70 and 80 percent, no system is foolproof. But, says Michael McCord of IBM’s Language Analysis and Translation group in Yorktown Heights, NY, translation software significantly reduces the time it takes for humans to translate documents, and sometimes an approximate translation will do. Meanwhile, accuracy rates keep inching up. “More work, bigger machines, more people. We’ll get better,” he says.