Intelligent Machines

Translation by Numbers

Language Weaver’s machine-translation software aids homeland security.

Bryce Benjamin knew a winner when he saw one. It was December 2001, and the infotech entrepreneur was meeting with two professors who were starting a company to commercialize “statistical machine translation.” Their breakthrough: software that could learn automatically to translate text from one language into another.

Benjamin believed the technology was desperately needed for purposes of both homeland security and business communication. The company’s location – on the water in sunny Marina del Rey, CA – didn’t hurt either. “I looked out at the view,” he says, “and I thought, ‘This deal has a lot of promise.’”

Today the trio’s 35-person startup, Language Weaver, is one of the leading companies in the burgeoning field of machine translation. For U.S. counterterror translators facing a growing backlog of untranslated audiotapes and communiques, software is increasingly the weapon of choice. Multinational corporations like Google, Yahoo, and Microsoft – not to mention smaller companies with global staff – are also driving the demand for machine translation of technical documents and Web pages.

This story is part of our October 2005 Issue

Language Weaver’s software translates text between English and half a dozen other languages, including Arabic, Chinese, and Spanish. So far, the technology is most useful as a screening tool that monitors reams of foreign-language news broadcasts, chat rooms, and websites. “People use our translation software to determine the relevance of information, as a triage function,” says Benjamin, the company’s CEO. “It’s very good at telling what a certain passage is about.”

Most machine-translation systems work on individual words or use complicated sets of translation guidelines, which must be devised by linguists and coded by hand. Language Weaver’s technology, which company cofounders Kevin Knight and Daniel Marcu developed at the University of Southern California’s Information Sciences Institute (ISI), takes a different tack. It uses human translation data, such as United Nations transcripts, to set up “parallel corpora” of text passages in two languages, aligned sentence by sentence.

From these side-by-side comparisons, the software learns to translate between the languages – extracting statistical patterns that indicate that a particular grouping of words in Arabic, say, tends to correspond to certain words in English. The system translates phrase by phrase, so if it encounters the words “interest rate,” it will associate them with banks and finance, not curiosity and speed. And the machine-learning approach means the translations should improve with time. “The more data you add, the better the performance will be,” says Knight.
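The idea of learning correspondences from aligned sentences can be illustrated with a toy example. The sketch below is not Language Weaver's system, whose internals the article does not describe. It assumes a much simpler, word-level technique (expectation-maximization in the style of IBM Model 1) rather than phrase-by-phrase translation, and the three-sentence Spanish-English corpus and the function names `train_word_model` and `best_translation` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sentence-aligned "parallel corpus": (Spanish, English) pairs.
PARALLEL_CORPUS = [
    ("la casa", "the house"),
    ("la flor", "the flower"),
    ("una casa", "a house"),
]


def train_word_model(pairs, iterations=10):
    """Estimate t(english_word | spanish_word) by IBM Model 1-style EM."""
    # Before any training, every word pair is treated as equally likely.
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (e, f) pairing
        total = defaultdict(float)  # expected count of each source word f
        for src, tgt in pairs:
            src_words, tgt_words = src.split(), tgt.split()
            for e in tgt_words:
                # E-step: split one unit of credit for the English word e
                # among the source words in the same sentence, in proportion
                # to the current scores.
                norm = sum(t[(e, f)] for f in src_words)
                for f in src_words:
                    share = t[(e, f)] / norm
                    count[(e, f)] += share
                    total[f] += share
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(
            float, {(e, f): c / total[f] for (e, f), c in count.items()}
        )
    return t


def best_translation(t, source_word):
    """Return the English word with the highest score for source_word."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    model = train_word_model(PARALLEL_CORPUS)
    for word in ("la", "una", "casa", "flor"):
        print(word, "->", best_translation(model, word))
```

No dictionary is supplied here: the pairing of "casa" with "house" emerges only because the two words keep appearing in the same aligned sentence pairs while their neighbors change. With more aligned text the estimates have more evidence to draw on, which is the intuition behind Knight's point about adding data.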

Until recently, this statistical approach, which has roots in 1940s wartime cryptography, was too slow to be useful. But on a modern PC, Language Weaver’s software can translate 5,000 words a minute at state-of-the-art accuracy levels; on a network of servers, it can handle up to 500,000 words a minute.

The names of the company’s U.S. government customers are kept tightly under wraps, but Benjamin says feedback from the intelligence community has been overwhelmingly enthusiastic and that the technology “played an important role in a mission that saved lives.”

But human translators won’t lose their jobs anytime soon. To translate some kinds of material accurately, such as technical documents, Language Weaver’s system must be trained on similar texts translated by hand. And some experts are skeptical, since the statistical approach hasn’t solved the deeper problem of getting computers to understand natural language.

“It is good for the field that companies like Kevin’s succeed,” says Sergei Nirenburg, a machine translation expert at the University of Maryland, Baltimore County. But, he adds, “you will still have to file the product under ‘Good Uses for Crummy Machine Translation.’”

Nevertheless, Language Weaver has been profitable since late 2003 and is now investigating business applications. At ISI, Knight and Marcu are testing software for a handheld translator that can handle questions and answers between doctors and patients. An outside company is also using the Language Weaver software to build a real-time Arabic-English translator for instant messaging.

But the biggest market may be multilingual search: typing a query in English could bring up scores of previously invisible foreign websites that could be translated into English. In five years, says Knight, expect to be able to read any Web page in practically any language. “You’ll wonder how you ever did without it,” he says. “But you’ll still laugh at the translation mistakes.”
