Semantic Sense for the Desktop

A project brings Semantic Web technology to personal documents.

Erica Naone

December 16, 2008

People naturally group information by topic and remember relationships between important things, like a person and the company where she works. But enabling computers to grasp these same concepts has been the subject of long-standing research. Recently, this has focused on the Semantic Web, but a European endeavor called the Nepomuk Project will soon see the effort take new steps onto the PC in the form of a “semantic desktop.”

Those working on the project, coordinated by the German Research Center for Artificial Intelligence (DFKI), have been toiling for three years to create software that can spot meaningful connections between the files on a computer. Nepomuk’s software is available for several computer platforms and now comes as a standard component of the K Desktop Environment (KDE), a popular graphical interface for the Linux operating system.

The idea of a semantic desktop is not new. The Open Source Applications Foundation and SRI, two nonprofit organizations, have both worked on similar projects. But previous efforts have suffered from the difficulty of generating good semantic information: for semantic software to be useful, files and documents need to be tagged with semantic information. Yet without useful applications in the first place, it is hard to persuade users to generate and tag this data themselves.

Nepomuk is distinguished by a more practical vision, says Ansgar Bernardi, deputy head of knowledge management research at DFKI. The software adds a lot of semantic information automatically and encourages users to add more by making annotated data more useful. It also provides an easy way to share tagged information with others.

The software generates semantic information by using “crawlers” to go through a computer and annotate as many files as possible. These crawlers look through a user’s address book, for example, and search for files related to the people found in there. Nepomuk can then connect a file sent by a particular person with one related to the company that person works for, making Nepomuk a particularly useful way to search a computer, Bernardi says.

While most operating systems let users search their computers by keyword alone, Nepomuk can uncover more useful information by focusing on the connections between data; it can locate relevant files even if they don’t mention the keyword used to search. A peer-to-peer file-sharing architecture built into the system also makes it easy to share files and the associated semantic data between users.
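To make the idea of searching by connection concrete, here is a minimal sketch in Python. It is an illustration only, not Nepomuk's code or data model: the file names, people, relation labels, and the functions `build_graph`, `related`, and `search_files` are all hypothetical. It assumes a crawler has already recorded simple (subject, relation, object) facts, and it shows how a search for a company could surface a file that never mentions the company by following the link through the person who sent it.

```python
from collections import defaultdict

# Hypothetical facts a crawler might have recorded (all names are invented).
TRIPLES = [
    ("alice", "works_for", "acme"),
    ("report.pdf", "sent_by", "alice"),
    ("contract.doc", "about", "acme"),
    ("notes.txt", "about", "gardening"),
]

# Assumption for this sketch: these relations always have a file as subject.
FILE_RELATIONS = ("sent_by", "about")


def build_graph(triples):
    """Index each fact in both directions so links can be followed either way."""
    graph = defaultdict(set)
    for subject, _relation, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def related(graph, start, max_hops=2):
    """Return every node reachable from `start` within `max_hops` links."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # Step one link outward, skipping nodes that were already visited.
        frontier = {n for node in frontier for n in graph[node]} - seen
        seen |= frontier
    return seen - {start}


def search_files(triples, term, max_hops=2):
    """List files connected to `term`, even if they never mention it."""
    files = {s for s, relation, _ in triples if relation in FILE_RELATIONS}
    return sorted(files & related(build_graph(triples), term, max_hops))


if __name__ == "__main__":
    print(search_files(TRIPLES, "acme"))
```

In this toy data, searching for "acme" returns both contract.doc, which is directly about the company, and report.pdf, which is linked only through the person who sent it. The `max_hops` limit is a design choice of the sketch: without some bound, loosely related files would crowd the results.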

“This might be the semantic desktop that actually survives,” says Nova Spivack, CEO and founder of Radar Networks, the company behind Twine, a semantic bookmarking and social-networking service. “There’s a lot of potential to build on what they’ve done.”

Spivack notes that other efforts to bring semantic technology to the desktop haven’t succeeded in reaching end users. “Nepomuk is designed for real people and developers,” he says. For this reason, Spivack sees the inclusion of Nepomuk in KDE as particularly important, since KDE software is widely distributed and can easily be modified by software developers.

Although funding for the official Nepomuk project ends this month, Bernardi expects it to continue as an open-source software effort. A spinoff company is also in the works, he says, and a newly founded legal body called the Open Semantic Collaboration Architecture Foundation will help coordinate continuing work on the technology created by Nepomuk.

Nepomuk’s software is available on several platforms besides KDE. Users can download the basic software for free for Windows, Macintosh, and Linux. It is also possible to use Nepomuk in a more limited way (just for Web pages viewed through Firefox, for example) with a limited installation.
