You can’t have a Semantic Web without metadata, but metadata alone won’t suffice. The metadata in Web pages will have to be linked to special documents that define metadata terms and the relationships between the terms. These sets of shared concepts and their interconnections are called “ontologies.”
Say, for example, that you’ve made a Web page listing the members of a faculty. You would tag the names of the different members with metadata terms such as “chair,” “associate professor,” “professor” and so on. Then you’d link the page to an ontology (one that you created yourself or one that someone else has already made) that defines educational job positions and how they relate to each other. An appropriate ontology would in this case define a chair as a person, not a thing you sit on, and it would indicate that a chair is the most senior position in a department.
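To make the faculty example concrete, here is a minimal sketch in Python of how software might use such an ontology. Everything in it is an illustrative assumption: the dictionary layout, the `kind` and `rank` fields, the placeholder names and the helper functions are hypothetical, not part of any Semantic Web standard.

```python
# Hypothetical ontology of academic job positions. The field names
# ("kind", "rank") and the numeric ranks are assumptions for illustration;
# a higher rank means a more senior position.
ACADEMIC_ONTOLOGY = {
    "chair": {"kind": "person", "rank": 3},
    "professor": {"kind": "person", "rank": 2},
    "associate professor": {"kind": "person", "rank": 1},
}

# Metadata a faculty page might carry: (name, position term).
# The names are placeholders.
PAGE_METADATA = [
    ("A. Example", "associate professor"),
    ("B. Example", "chair"),
    ("C. Example", "professor"),
]


def resolve(term, ontology):
    """Look up what a metadata term means in the linked ontology."""
    try:
        return ontology[term]
    except KeyError:
        raise ValueError(f"term {term!r} is not defined in the linked ontology")


def most_senior(page, ontology):
    """Return the name tagged with the highest-ranked position."""
    name, _term = max(page, key=lambda entry: resolve(entry[1], ontology)["rank"])
    return name


print(resolve("chair", ACADEMIC_ONTOLOGY)["kind"])    # person
print(most_senior(PAGE_METADATA, ACADEMIC_ONTOLOGY))  # B. Example
```

The page itself never says what a chair is or who outranks whom. That knowledge lives in the ontology, so any page that links to it can be interpreted the same way.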
By defining the relationships between terms, ontologies let applications infer new facts. Suppose you have created a Web page that teaches schoolchildren about condors, and have added metadata to the content. You could link to an ontology (or, more likely, several ontologies) defining the various terms and their relationships: “California condor is a type of condor from California.” “Condor is a member of the raptor family.” “All raptors are carnivores.” “California is a state in the United States.” “Carnivores are meat eaters.” By using both metadata and ontologies, a search engine or other software agent could find your condor site based on a search request for “carnivores in the U.S.,” even if your site made no mention of carnivores or the United States.
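The chain of reasoning in the condor example can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any particular search engine or ontology language works: the relation names (`is_a`, `found_in`, `part_of`), the two inference rules and the function names are all invented for illustration.

```python
# Toy facts paraphrasing the ontology statements quoted in the text.
# The relation names are illustrative assumptions.
FACTS = {
    ("california_condor", "is_a", "condor"),
    ("california_condor", "found_in", "california"),
    ("condor", "is_a", "raptor"),
    ("raptor", "is_a", "carnivore"),
    ("carnivore", "is_a", "meat_eater"),
    ("california", "part_of", "united_states"),
}

# Relations assumed to chain: if A r B and B r C, then A r C.
TRANSITIVE = {"is_a", "part_of"}


def infer(facts):
    """Apply the rules repeatedly until no new facts appear."""
    known = set(facts)
    while True:
        new = set()
        for (a, p, b) in known:
            for (c, q, d) in known:
                if b != c:
                    continue
                # Rule 1: transitive relations chain together.
                if p == q and p in TRANSITIVE:
                    new.add((a, p, d))
                # Rule 2: something found in a place is also found in
                # any larger place that contains it.
                if p == "found_in" and q == "part_of":
                    new.add((a, "found_in", d))
        if new <= known:
            return known
        known |= new


def search(facts, category, region):
    """Find everything that belongs to `category` and is found in `region`."""
    closed = infer(facts)
    return sorted(
        s for (s, p, o) in closed
        if p == "is_a" and o == category and (s, "found_in", region) in closed
    )


print(search(FACTS, "carnivore", "united_states"))  # ['california_condor']
```

None of the starting facts mentions carnivores and the United States together. The match comes entirely from facts the program derived, which is the point of pairing metadata with ontologies.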
Because ontology development is a big undertaking, it’s likely that site creators will link to third-party ontologies. Some will be free, others will be sold or licensed. One issue that will have to be confronted: just as with dictionaries and atlases, political and cultural bias will creep into ontologies. A geography-based ontology maintained by the Chinese government, for instance, would probably not define Taiwan as a “country.”
But that hardly impedes the vision. As the World Wide Web Consortium continues to develop standards and technologies for the Semantic Web, hundreds of organizations, companies and individuals are contributing to the effort by creating tools, languages and ontologies.
One major contributor is DARPA, the folks responsible for a great deal of the technology behind the Internet (see “DARPA’s Disruptive Technologies,” TR October 2001). These days, DARPA is contributing tens of millions of dollars to the Web consortium’s Semantic Web project and has developed a semantic language for the U.S. Department of Defense called DARPA Agent Markup Language that allows users to add metadata to Web documents and relate it to ontologies. University of Maryland computer science professor Jim Hendler, who was until August manager of the DARPA program, has been working closely with Berners-Lee and Miller to ensure consistency with the consortium’s efforts. Last December, Hendler announced the creation of a language that combines the DARPA Agent Markup Language’s capabilities with an ontology language, developed in Europe, called OIL (which stands for both Ontology Inference Layer and Ontology Interchange Language).
A developer of this new language, University of Manchester lecturer Ian Horrocks, also advises the World Wide Web Consortium on the Semantic Web. In January, he cofounded a company called Network Inference to develop technology that uses ontologies and automated inference to give Semantic Web capabilities to existing relational databases and large Web sites. Recently, an Isle of Man-based data services company called PDMS began using Network Inference’s technology to add Semantic Web capabilities to corporate databases. Dozens of other companies, from Hewlett-Packard to Nokia, are contributing to Semantic Web development.