Emerging Technology from the arXiv

Wikipedia's Secret Multilingual Workforce

Wikipedia’s various language editions often carry entirely different content. Now one researcher has identified a small band of multilingual editors who are working to change that.

  • December 13, 2013

Wikipedia aims to provide free online access to all human knowledge. And a cursory look at its vital statistics appears to indicate that it’s well on its way to achieving that. The organisation has 77,000 active contributors working on over 22 million articles in 285 languages. All this attracts some 500 million unique visitors a month.

And yet a look beyond these figures reveals a subtle but important problem: there is surprisingly little overlap between the content in different language editions. No one edition contains all the information found in other language editions. And the largest language edition, English, contains only 51 per cent of the articles in the second largest edition, German.  

This problem is known as self-focus bias and it places a significant limit on the access to knowledge that Wikipedia provides. It means that Wikipedia offers people access not only to a mere fraction of human knowledge but also to a mere fraction of its own articles.

There is a group of people who could change this, says Scott Hale at the University of Oxford in the UK. He believes that people who edit Wikipedia in more than one language are the key. “Such multilingual users may serve an important function in diffusing information across different language editions of the project,” he says.

But do they actually play this role? Today, Hale reveals the results of his study of multilingual editors of Wikipedia. He says they turn out to be a small but important minority of editors who play a crucial role in helping to reduce the level of self-focus bias in each edition.

Hale began by crawling the edits to Wikipedia between 8 July and 9 August this year, which are broadcast in near real-time over Internet Relay Chat. He excluded minor edits and those made by bots and unregistered users. That left 3.5 million significant edits by 55,000 editors.

Hale then looked for editors who were active in more than one language edition and found more than 8,000 of them, or about 15 per cent of the total. It was these multilingual editors that he studied further.
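The filtering and counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hale's actual code: the record fields (`user`, `lang`, `minor`, `bot`, `anonymous`) and the function name `find_multilingual_editors` are hypothetical, and the sample edits are invented purely to exercise the logic.

```python
from collections import defaultdict


def find_multilingual_editors(edits):
    """Return (multilingual_editors, all_editors) as sets of user names.

    Each edit is a dict with hypothetical keys: 'user', 'lang',
    'minor', 'bot' and 'anonymous'. These names are assumptions
    made for this sketch, not fields from the study's data.
    """
    langs_by_user = defaultdict(set)
    for edit in edits:
        # Drop minor edits and edits by bots or unregistered users,
        # mirroring the exclusions described in the article.
        if edit["minor"] or edit["bot"] or edit["anonymous"]:
            continue
        langs_by_user[edit["user"]].add(edit["lang"])

    # An editor counts as multilingual if their remaining edits
    # span more than one language edition.
    multilingual = {user for user, langs in langs_by_user.items() if len(langs) > 1}
    return multilingual, set(langs_by_user)


# Invented sample data, for illustration only.
sample_edits = [
    {"user": "Ana", "lang": "en", "minor": False, "bot": False, "anonymous": False},
    {"user": "Ana", "lang": "es", "minor": False, "bot": False, "anonymous": False},
    {"user": "Ben", "lang": "de", "minor": False, "bot": False, "anonymous": False},
    {"user": "Cy", "lang": "en", "minor": False, "bot": False, "anonymous": False},
    {"user": "Cy", "lang": "ja", "minor": True, "bot": False, "anonymous": False},
    {"user": "FixBot", "lang": "en", "minor": False, "bot": True, "anonymous": False},
    {"user": "FixBot", "lang": "fr", "minor": False, "bot": True, "anonymous": False},
    {"user": "192.0.2.1", "lang": "en", "minor": False, "bot": False, "anonymous": True},
]

multilingual, editors = find_multilingual_editors(sample_edits)
share = len(multilingual) / len(editors)
```

In this toy sample only Ana counts as multilingual: Cy's Japanese edit is minor and so is ignored, and the bot and the unregistered address never enter the count at all. The same ratio applied to the article's figures (more than 8,000 out of 55,000 editors) gives the roughly 15 per cent quoted above.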

It turns out that some editions have more multilingual editors than others and, in general, smaller editions have a higher percentage of them. The outliers with the highest proportion of multilinguals were the Esperanto and Malay editions, while the Japanese edition had significantly fewer multilingual editors than its size would suggest.

Significantly, these multilingual editors are more active than their monolingual counterparts, making 2.3 times as many edits on average.

What’s more, almost half of the articles added by multilingual editors are not edited at all by monolingual editors. Multilinguals also tend to edit the same articles in different languages. That’s significant because it implies that they are transferring new articles from one edition to another.

“This suggests that multilingual users are making unique contributions not duplicated by monolingual users and that in many cases multilingual users are working on the same article in multiple languages,” says Hale.

That’s interesting work. “Overall, this study shows multilingual users play a unique role on Wikipedia editing articles different to those edited by monolingual users,” concludes Hale.

And that’s an important job. If Wikipedia is to tackle the problem of self-focus bias, it will need more editors like them. But just where they will come from is another question altogether.

Ref:  arxiv.org/abs/1312.0976: Multilinguals and Wikipedia Editing
