MIT Technology Review

 

September 11 affected millions of people in myriad ways. For Ed Bice, an American ex-architect, it sparked a desire to get ordinary Middle Easterners–and Westerners–talking together. Naturally, being based in the Bay Area, he turned to the Web for help.

The result, six years later, is Meadan, which means “town square” in Arabic. The basic idea is simple: it’s a website that brings English and Arabic speakers together around daily postings of news articles, broadcasts, and events that are of common interest, and it gives users a platform to communicate through dialogues, blogs, and other exchanges. All the while, it allows users to pinpoint their location so that people can share views across continents.

The hard part is creating a system that allows users to express their ideas in their native tongue. Enter IBM, which has committed $1.7 million to this not-for-profit project. The company has one of the most advanced systems for Arabic-English machine translation. It’s 84 percent accurate and can transmute Arabic to English and back again at a blistering 500 words per second.

This is no easy task, says Salim Roukos, a senior manager for multilingual natural-language processing technologies at IBM’s Watson Research Center. Because word order in Arabic sentences differs from word order in English, verbs can get lost–quite literally–in machine translation. Moreover, Arabic words have prefixes, suffixes, and other forms that allow them to agree in gender and number–a rigor that freewheeling English lacks and that makes translation from English to Arabic even trickier.

IBM’s statistically based translation system has been trained on a massive amount of material, called a parallel corpus, in both modern standard Arabic and formal English–the language of news reports. That means it has roughly 100 million words and more than 10 million phrases to call upon when presented with new text. But the system struggles with slang and other colloquialisms–all the more difficult in Arabic because street talk varies from country to country.

But this is exactly the sort of language that Meadan’s online community will use. So the alpha test, which was launched last month, also calls on the services of human translators to correct IBM’s machine translations. There is plenty of work to be done. Even a basic English expression like “That’s great!” comes out of the machine as the equivalent of “That’s big!” in Arabic. It’s up to users to point this out and up to designated translators to fix it. The correct pair of translations then becomes another piece of data from which the machine can learn.

Meadan hopes to roll out a beta version later this year–provided it raises the $2 million or so it needs to move forward. Bice has high hopes. “A year from now, I hope we are a global social network, talking across languages about events in the world.” Insha’allah, as we say in Arabic.

Tagged: Communications, IBM, language, translation
