MIT Technology Review

Möller says an example of the kind of services that could be enabled is WikiPics, developed by Daniel Kinzler at the German Wikimedia foundation. Kinzler scraped a database of all the links that connect different Wikipedia pages available in multiple languages and built a fully multilingual image search. When a user puts in the term “horse,” for example, the service knows to also find images of “cheval” (French) and “Pferd” (German). “You’re searching concepts instead of terms,” says Möller. However, for now the site relies on the slow process of scraping the whole of Wikipedia to update its knowledge. A semantic Wikipedia would maintain a live database that could be queried at any time.
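The idea of searching concepts rather than terms can be illustrated with a minimal sketch. It assumes interlanguage links are available as pairs of (language, title) entries and groups linked titles into concepts; the link data, image captions, and function names below are hypothetical and are not taken from WikiPics itself.

```python
from collections import defaultdict

# Hypothetical interlanguage links: each pair says two titles name the same concept.
LINKS = [
    (("en", "horse"), ("fr", "cheval")),
    (("en", "horse"), ("de", "Pferd")),
    (("en", "dog"), ("fr", "chien")),
]

# Hypothetical image index: file name -> caption term.
IMAGES = {
    "a.jpg": "cheval",
    "b.jpg": "Pferd",
    "c.jpg": "chien",
    "d.jpg": "horse",
}


def build_concepts(links):
    """Group linked titles into concepts using a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)

    # Simplification: terms are looked up without their language tag, so the
    # same spelling in two languages would be treated as a single term.
    term_to_concept = {}
    for members in groups.values():
        terms = frozenset(title.lower() for _, title in members)
        for term in terms:
            term_to_concept[term] = terms
    return term_to_concept


def search_images(term, term_to_concept, images):
    """Return images whose caption matches any term in the query's concept."""
    key = term.lower()
    terms = term_to_concept.get(key, frozenset({key}))
    return sorted(name for name, caption in images.items()
                  if caption.lower() in terms)


concepts = build_concepts(LINKS)
print(search_images("horse", concepts, IMAGES))
```

A query for "horse" here also returns the images captioned "cheval" and "Pferd". In this sketch the link table is built once up front, which mirrors the scraping limitation described above; a live semantic store would answer the same lookup without a rebuild.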

Wikipedia faces two big challenges in embracing semantic concepts, says Möller. One is that no one has yet built a semantic web service on the scale of a site such as Wikipedia, and it is unclear whether existing software like Semantic MediaWiki is up to the task, he says.

A second challenge is the feature of Wikipedia most responsible for its success so far: its community. “Thinking about adding semantic structure is a natural extension of what Wikipedia needs to do, given prevailing trends,” says Andrew Lih of the University of Southern California, author of the 2009 book The Wikipedia Revolution. “But I do worry a bit about the database aspect that comes with this: the attraction of wikis in the first place is in the way they have been hand-edited by humans.”

Parscal has been leading efforts to make it easy for anyone to add or edit the data of a large semantic store. “We’ve been working on a visual editor that suggests how we might help users contribute structured data, and that also makes the editing process easier,” says Parscal.

Editing Wikipedia today is already a daunting process that needs improvement, admits Parscal. “If you’ve interacted with our interface,” he explains, “you’ve been slapped in the face by wikitext” (a markup language that uses special code around text to format things like links, references, and section headings). The wikitext for tables or infoboxes, the information most ripe for being made semantic, is particularly dense and hard to understand, says Parscal. “We recently did some user experience studies with people that hadn’t used it before; they were quickly quite frustrated.”

In future, it may be possible to remove the need for a human to populate some parts of Wikipedia altogether, says Möller. “Fundamentally a lot of this data probably shouldn’t be entered by humans in the first place, it should just, say, poll the source of a figure like GDP once a year.” That’s a capability that Koren has already added to Semantic MediaWiki, through an extension called ExternalData.
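The polling idea Möller describes can be sketched as a value that refreshes itself from a source at a fixed interval instead of being typed in by an editor. This is only an illustration under stated assumptions: the `PolledFigure` class and the `fetch` callable are hypothetical, and the sketch does not represent how the ExternalData extension is implemented.

```python
from datetime import datetime, timedelta

# Assumption: "once a year" is approximated as a 365-day refresh interval.
REFRESH_INTERVAL = timedelta(days=365)


class PolledFigure:
    """Hold a figure that is re-fetched from its source when it becomes stale."""

    def __init__(self, fetch, interval=REFRESH_INTERVAL):
        self._fetch = fetch          # hypothetical callable returning the current figure
        self._interval = interval
        self._value = None
        self._fetched_at = None

    def get(self, now):
        """Return the cached figure, polling the source if it is missing or stale."""
        stale = (self._fetched_at is None
                 or now - self._fetched_at >= self._interval)
        if stale:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


# Usage with a stand-in source; a real source would be an external data feed.
readings = iter([100.0, 103.5])
gdp = PolledFigure(lambda: next(readings))

start = datetime(2010, 1, 1)
first = gdp.get(start)                          # first call polls the source
same = gdp.get(start + timedelta(days=30))      # still fresh, no new poll
later = gdp.get(start + timedelta(days=400))    # stale, polls again
```

Passing `now` in explicitly keeps the sketch deterministic; a production version would more likely read the clock itself and handle fetch failures.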


Tagged: Web, search, Web 2.0, semantic, Wikipedia
