Computing

Translating the Web While You Learn

A new website will offer free language lessons—and use the results to render Web pages in other tongues.

The creators of a website called Duolingo want to translate the world’s Web pages into new languages by harnessing the efforts of people who are learning those languages.

Multilingual: Luis von Ahn (center) and his research team at Carnegie Mellon University.

If the approach sounds familiar, it’s because a similar idea is the basis of the effort known as reCAPTCHA, which was invented by the same Carnegie Mellon computer science professor behind the new project: Luis von Ahn.

A recaptcha is a string of distorted text shown to a user trying to register for a new account or comment on a Web page; the text comes from electronically scanned print that could not be recognized by a computer. To gain permission, the user must reënter the words correctly. More than a hundred million recaptchas are solved each day. Von Ahn says that if he can capture even a small portion of that audience with Duolingo—say, a million users—he could translate all of Wikipedia’s English entries into Spanish in 80 hours.

Even though the Duolingo site has yet to launch—von Ahn says it will enter private beta “on the order of weeks” from now—he was able to reveal a few details about how it operates. The basic premise is simple: users, even those who have never spoken a particular language before, are presented with short phrases on which to practice. The system helps them by defining some of the words in the phrase.

Users’ attempts to translate the phrase are later voted on by other users, and the most accurate translation “wins.” In a talk given at a recent TEDx conference at Carnegie Mellon, von Ahn said that the results “are as accurate as translations from professional language translators.”
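Von Ahn did not describe how the votes are counted. As a rough illustration only, the sketch below assumes simple plurality voting, in which each vote names one candidate translation and the candidate with the most votes wins. The function name and the tie-breaking rule are hypothetical and are not taken from Duolingo.

```python
from collections import Counter


def pick_winning_translation(votes):
    """Return the candidate translation that received the most votes.

    `votes` is a list of candidate strings, one entry per vote cast.
    Assumption: plain plurality voting; the article does not say how
    Duolingo actually weighs votes.
    """
    if not votes:
        raise ValueError("no votes cast")
    tally = Counter(votes)
    # max() returns the first maximal key, and Counter preserves insertion
    # order, so a tie goes to the candidate that was voted for first.
    return max(tally, key=tally.get)


votes = ["the cat sleeps", "a cat sleeps", "the cat sleeps"]
winner = pick_winning_translation(votes)
```

A real system would probably weight votes by each voter's track record, but the article gives no detail on that point.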

As for Duolingo’s capabilities as a language teacher, von Ahn says his team’s tests indicate that users “do about as well as with other methods.”

Duolingo has one big advantage over other language tools: it’s free. This means its potential audience is enormous, encompassing anyone with a computer. Eventually, says von Ahn, he wants to make the system accessible by mobile phones, which would increase its reach by hundreds of millions, if not billions, of potential users.

“These guys are brilliant,” says Christopher O’Donnell, former head of product at Transparent, a maker of language-learning software whose customers include the U.S. Department of Defense. “They might be on to something super elegant and amazingly perfect, like recaptcha was. If they do the recaptcha of language, it’s massive.”

But Duolingo’s success will depend in no small part on whether the site can keep users coming back. To that end, the system has been tested and updated continually since the fall of 2010.

“A huge part [of making it successful] is that you just need to experiment. It’s nothing but trial and error,” says Severin Hacker, a PhD student at Carnegie Mellon and the lead architect of Duolingo.

Initially, Duolingo will launch with just three languages: English, Spanish, and German. The eight-member team working on the project had originally intended to tackle more, but they soon discovered that development was too slow for languages that were not native to at least one team member.

Hacker says the multilingual nature of Duolingo was one of the biggest challenges in its development. Users with keyboard layouts intended for English, for example, cannot easily generate special characters used in other languages, such as the umlaut in German. As a result, developers and the team’s designer had to put a lot of effort into honing the interface, including developing a fast and intuitive virtual keyboard for generating these characters.

Aside from the promise of free language instruction, it isn’t clear how Duolingo may entice users. But many of von Ahn’s past projects have involved casual games designed to encourage players to perform useful tasks that computers can’t manage on their own. (One game he developed, which makes it fun for users to label images, was later purchased by Google and now enhances the utility of Google Image Search.)

“A hard thing when learning a language is just staying motivated,” says von Ahn. “A large fraction of people want to learn a language, but at [the] end of [the] day it is hard to do it. We had to tackle that problem.”

Another hurdle to the success of Duolingo is that, unlike recaptcha, which is embedded on countless websites, Duolingo will require that users show up in the first place. Von Ahn says he has no idea whether it will generate sufficient attention, but interest is already high. So many people have already signed up for the private beta, he says, that “if we only had those users, we could already translate a lot of stuff.”
