In this way, the Piggy Bank researchers hope, Web users can begin to get a taste of the Semantic Web in action, without having to wait for the authors of the billions of documents on the Web to create metadata. The curious can download a Piggy Bank extension for the Firefox Web browser; once the extension is installed, users can choose from a number of “screenscrapers” that extract information from specific sites like LinkedIn and Flickr (a popular photo-sharing site). Piggy Bank stores this “pure information,” such as photos or contact names, inside the Web browser in RDF format, theoretically allowing users to mix data from independent sources to create their own “instant mashups” similar to the LinkedIn-Google Maps example.
Unfortunately, there aren’t yet any tools that make it easy for nonprogrammers to reuse the RDF data in such mashups. And in my own tests of Piggy Bank, the screenscrapers failed to activate. I’m sure that’s because I missed something in the instructions–but the problem does illustrate how much more work is needed before such tools will be ready for public consumption.
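To make the "instant mashup" idea concrete, here is a minimal sketch of what mixing data from two independent sources could look like once everything is stored as subject-predicate-object statements, the basic shape of RDF. It is not Piggy Bank's code and uses no RDF library. The identifiers (`person:alice`, `place:boston`), the predicate names, and the `mashup` helper are all hypothetical, chosen only for illustration.

```python
# Each fact is a (subject, predicate, object) triple, mirroring the shape
# of an RDF statement. All data and names below are made up.
contacts = [  # as if scraped from a contacts site
    ("person:alice", "name", "Alice Example"),
    ("person:alice", "worksIn", "place:boston"),
    ("person:bob", "name", "Bob Example"),
    ("person:bob", "worksIn", "place:chicago"),
]
places = [  # as if scraped from a mapping site
    ("place:boston", "lat", 42.36),
    ("place:boston", "long", -71.06),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def mashup(contacts, places):
    """Join the two sources on the shared place identifier."""
    merged = contacts + places  # triples from any source can simply be pooled
    pins = []
    for person, predicate, place in contacts:
        if predicate != "worksIn":
            continue
        name = objects(merged, person, "name")
        lat = objects(merged, place, "lat")
        lon = objects(merged, place, "long")
        # Only emit a map pin when both sources supplied what is needed.
        if name and lat and lon:
            pins.append((name[0], place, lat[0], lon[0]))
    return pins


pins = mashup(contacts, places)
```

In this sketch only the contact whose city also appears in the second source ends up with a map pin. The point is that the join works because both sources use the same identifier for the place, which is exactly the kind of agreement that shared metadata is supposed to provide.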
A second category of post-Web 2.0 projects focuses not on helping machines understand the meaning and the uses of existing Web content, but on recruiting real people to add their intelligence to information before it’s used. The best known example is Amazon Mechanical Turk, a kind of high-tech temp agency introduced by the online retailer in 2005. The service allows people with tasks and questions that computers can’t handle–for example, spotting inappropriate images in a collection of photos–to hire other Web users to help.
The employment is extremely temporary–less than an hour per task, in most cases–and the pay is ridiculously low: solutions typically earn the worker only a few cents. But the point isn’t to provide Internet addicts with a second income: it’s to harness users’ brainpower for a few spare moments to carry out simple tasks that remain far beyond the capabilities of artificial-intelligence software. (In fact, Amazon calls its project a form of “artificial artificial intelligence.”)
Some tasks are really marketing or product research in disguise. One questioner, for example, asks, “What would make your e-mail better?” Others offer better illustrations of the logic behind breaking up a big data-classification task and distributing it to hundreds of people. One task, apparently from someone trying to make it possible to share information between various Yellow Pages-style directories, asks users to match categories from one directory–say, “Delicatessens”–with the closest equivalents in another–for example, “Delis” or “Small Restaurants.” A computer couldn’t tackle such a task without years of training on the mundane facts of human existence, such as the fact that a delicatessen is indeed one form of a small restaurant. A human, however, can find the right matches in seconds.
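The logic of farming out a classification task to many people can be sketched in a few lines: ask several workers the same question, then keep the answer only when enough of them agree. The code below is an illustration under assumptions, not Amazon's method. The `aggregate` function, the `min_agreement` threshold, and the "Florists" example are hypothetical; only the "Delicatessens" categories come from the task described above.

```python
from collections import Counter


def aggregate(answers, min_agreement=0.5):
    """Pick the most common answer per item, or None without a clear majority.

    `answers` maps each source category to the list of matches proposed by
    different workers. `min_agreement` is an assumed threshold: the winning
    answer must be given by more than this fraction of workers.
    """
    results = {}
    for item, votes in answers.items():
        if not votes:
            results[item] = None  # nobody answered this item
            continue
        label, count = Counter(votes).most_common(1)[0]
        results[item] = label if count / len(votes) > min_agreement else None
    return results


# Hypothetical worker responses for two directory categories.
answers = {
    "Delicatessens": ["Delis", "Delis", "Small Restaurants"],
    "Florists": ["Flower Shops", "Garden Centers"],
}
matches = aggregate(answers)
```

Here "Delicatessens" resolves to "Delis" because two of three workers agree, while "Florists" stays unresolved and could be sent out again to more workers. Requiring agreement is one simple way to guard against careless answers when each one costs only a few cents.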
Another project that attempts to persuade humans to add meaning to raw data is the Google Image Labeler. It entices users to label digital photographs according to their content by making the task into a simple game in which contestants must both collaborate and compete. Like Amazon Mechanical Turk, the Image Labeler has a community of fans who enjoy it as a game. And there’s nothing wrong with making potentially dull tasks entertaining, if that’s what it takes to motivate “workers.” But the Image Labeler and the Mechanical Turk will have to grow beyond their toylike demonstration stage before they have a real impact on the Web’s usability.
It’s not surprising that observers are reaching for new labels to describe the work going on beyond the boundaries of today’s Web 2.0. But most of these projects are so far from producing practical tools–let alone services that could be commercialized–that it’s premature to say they represent a “third generation” of Web technology. For that, judging from today’s state of the art, we’ll need to wait another few years.