The World Wide Translator

Will Web-wide “translation memory” finally make machine translation pay off?

“Hour is the moment for all the good men to come to the subsidy of them country.”

Hardly a rousing cry. Despite hundreds of millions of dollars and decades of research, such gibberish typifies the results of language translation software. As a result, the translation business hasn’t come very far from its days as a cottage industry: an expensive, time-consuming process dependent on highly specialized human translators.

Globalization companies hope to break through this barrier with software that employs translation memory, a way to use past translations to speed new ones. But building a useful database of translations is a slow and expensive endeavor, and companies guard their translations jealously.

Even worse, globalization software makers have been slower than other high-tech industries to develop standards for interoperability. If, for example, General Motors decides to switch translation software, it can’t take its translation memory with it, a potential loss of millions of dollars of intellectual property.

“You might have a huge translation memory, but if your client requires you to use another tool, you can’t use it,” says Kara Warburton, a terminology expert at IBM. Warburton belongs to two industry groups working toward a solution: a technical committee at the International Organization for Standardization, and the Localization Industry Standards Association, a trade group.

Their ultimate goal: when anyone, anywhere, corrects the sentence above, it will forever after translate: “Now is the time for all good men to come to the aid of their country.”

Extremely Complex

“This whole area of language is extremely complex,” says IDC analyst Steve McClure. “It’s probably the most complicated problem in computer science that I’m aware of.”

Computer-assisted translation typically involves two steps. First, a rules engine parses the original sentence, attempting to identify the relationships between the words. The engine then translates each word within the context that it believes to be correct, often with mixed results.

That’s how most machine translation works, including Altavista’s Babelfish Web site (source of the example above, translated from English to Italian and back) and freetranslation.com.

“Unfortunately,” says Mark Lancaster, CEO of SDL International, a London-based globalization firm, “the way that we speak is very ambiguously. And so it’s very difficult to interpret random input, which is essentially how we speak.” As a result, no matter how good a rules engine is, a human translator still must correct its mistakes (“Hour is the moment”).

This correction step remains the most time-consuming and expensive aspect of translation, often requiring expertise in a specific technical field as well as in the source and target languages. Moreover, two human experts may translate the same passage differently in texts where consistency is desired.

To correct this problem, translation memory stores the human-corrected translation along with the original, non-translated text. For each document, the software compares each sentence of the original to its growing translation memory.

When it finds a sentence it has seen before, it uses the remembered translation instead of the rules engine: knowing, instead of guessing. It then flags the new sections, cutting down the time spent by human reviewers. And as it adds each successive document to its translation memory, it knows more and guesses less.
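The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any vendor's product works: the `TranslationMemory` class, the `translate_document` function, and the `rules_engine` callback are hypothetical names, and matching here is exact-string only.

```python
from typing import Callable, Dict, List, Optional, Tuple


class TranslationMemory:
    """Hypothetical store of source sentences and human-approved translations."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def add(self, source: str, approved: str) -> None:
        # Remember a human-corrected translation alongside its original text.
        self._store[source.strip()] = approved

    def lookup(self, source: str) -> Optional[str]:
        # Exact match only; returns None for a sentence never seen before.
        return self._store.get(source.strip())


def translate_document(
    sentences: List[str],
    memory: TranslationMemory,
    rules_engine: Callable[[str], str],
) -> List[Tuple[str, str, bool]]:
    """Return (source, translation, needs_review) for each sentence."""
    results = []
    for sentence in sentences:
        remembered = memory.lookup(sentence)
        if remembered is not None:
            # Seen before: reuse the stored translation, no review flag.
            results.append((sentence, remembered, False))
        else:
            # New sentence: fall back to the rules engine and flag it.
            results.append((sentence, rules_engine(sentence), True))
    return results
```

Once a reviewer corrects a flagged sentence, calling `memory.add(source, corrected)` is what lets the next document skip the guesswork for that sentence.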

For closely related sentences, fuzzy matching allows the software to produce a partial translation while flagging the differences for a human reviewer.
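One way to picture fuzzy matching is a word-level similarity score with a cutoff. The sketch below uses `SequenceMatcher` from Python's standard `difflib` module; the `fuzzy_lookup` function, the 0.7 threshold, and the sample sentence pair are assumptions made for illustration, and commercial tools may score matches quite differently.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def fuzzy_lookup(
    sentence: str, memory: Dict[str, str], threshold: float = 0.7
) -> Optional[dict]:
    """Find the closest remembered sentence and list the words that differ."""
    new_words = sentence.split()
    best = None
    for source, translation in memory.items():
        # Compare word sequences rather than characters.
        matcher = SequenceMatcher(None, source.split(), new_words)
        score = matcher.ratio()
        if score >= threshold and (best is None or score > best["score"]):
            # Collect words in the new sentence that replace or extend
            # the remembered one; these are what a reviewer should check.
            changed = [
                " ".join(new_words[j1:j2])
                for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                if tag != "equal" and j2 > j1
            ]
            best = {
                "source": source,
                "translation": translation,
                "score": score,
                "changed": changed,
            }
    return best


# Illustrative data only: one remembered sentence pair.
memory = {"Press the red button to start.": "Premere il pulsante rosso per avviare."}
match = fuzzy_lookup("Press the green button to start.", memory)
# match["changed"] == ["green"]: the stored translation is reused as a draft,
# and only the changed word is flagged for the human reviewer.
```

The threshold is the design choice that matters: set it too low and reviewers get misleading drafts, too high and the software falls back to the rules engine for sentences it has nearly seen before.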

While not all computer-aided translation incorporates translation memory, many globalization software providers, including Trados, Mendez, Star AG, Atril, SDL, and Alchemy Software, offer products that do.

Who Wants to Play?

Lancaster is excited about the potential to share translation memories. “We’ve been building translation memories for ten years, so we have pretty big database repositories,” he says.

For now, Lancaster says, SDL uses those databases only for its own translation work but plans to develop a shareable one: customers using SDL’s translation software, SDLX, will gain access to a massive database of past translations. The price of admission? Customers will have to share their results or pay a premium to keep them private.

But the idea remains controversial. Would a company willingly share its intellectual property, potentially with competitors? It might in exchange for a discount, claims Lancaster.

Such a tradeoff may appeal to small or medium-sized companies, says McClure, but large companies consider their translation memories valuable intellectual property and would be unlikely to share them.

“If Cisco has to go to the trouble of translating the gigabit router instructions to Mandarin Chinese, that’s not going to be easy,” agrees analyst Eric Schmitt of Forrester Research. “It’s going to be expensive. Cisco doesn’t want to go to the trouble and then have Alcatel and Juniper come along and get the same benefit.”

Still, while these challenges remain great, they may not be computer translation’s largest stumbling block, says David Parmenter of Basis Technology in Cambridge, MA, a firm that assists companies in moving their business worldwide.

“The bulk of the translation business is built on foreign translators who do the work piecemeal,” Parmenter says. “It’s hard to beat the economics of that.”
