Adding Meaning to Millions of Numbers
Semantic technology could keep numbers tied to the information that explains what they mean.
Out of context, a number can be a dangerous thing. In 1999, for example, NASA’s $125 million Mars Climate Orbiter was destroyed because one team of engineers used imperial units of measurement while others relied on metric ones. Consequently, the spacecraft was ordered to orbit at a perilously low altitude, which caused it to burn up in the Martian atmosphere.
Similar number-related mix-ups occur all the time back on Earth, albeit with slightly less catastrophic consequences. True Engineering Technology, a startup based in Cambridge, MA, has now developed semantic technology that adds meaning to numerical data to help prevent such miscommunications from occurring.
Today the company is launching a website called Numberspace that lets users upload pieces of numerical data–the distance from New York to London in kilometers, for example (5,581). Once it is semantically tagged, the information can be shared without losing its meaning. Customers can also pay for a business version that stores their information on a private server.
To store a number in the system–creating what the company calls a “truenumber”–a user simply types a short phrase into a form on the website. For the example cited above, a user might type, “The distance from New York to London in kilometers is 5,581.” True Engineering Technology’s software then interprets the phrase, extracting the number 5,581 and the meaning conveyed by key words and phrases such as “New York,” “London,” and “distance.” Users can also add notes and comments to the truenumber, such as how it was calculated and whether there are queries concerning its accuracy.
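To make the idea concrete, here is a minimal sketch of how a phrase like the one above could be broken into a number and its context. It is not True Engineering Technology's actual software, which the article does not describe in detail; the `TrueNumber` class, the `parse_phrase` function, and the single sentence pattern it accepts are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical record for a stored number plus its context.
# The field names are illustrative assumptions, not the product's schema.
@dataclass
class TrueNumber:
    quantity: str                 # what is measured, e.g. "distance"
    subject: str                  # what it applies to, e.g. "from New York to London"
    unit: str                     # e.g. "kilometers"
    value: float                  # the bare number, e.g. 5581.0
    notes: list[str] = field(default_factory=list)  # user comments

# Assumed sentence shape: "The <quantity> <subject> in <unit> is <value>."
PATTERN = re.compile(
    r"^The (?P<quantity>\w+) (?P<subject>.+?) in (?P<unit>\w+) "
    r"is (?P<value>[\d,]+(?:\.\d+)?)\.?$",
    re.IGNORECASE,
)

def parse_phrase(phrase: str) -> TrueNumber:
    """Extract the number and its surrounding meaning from one phrase."""
    match = PATTERN.match(phrase.strip())
    if match is None:
        raise ValueError(f"Could not interpret phrase: {phrase!r}")
    return TrueNumber(
        quantity=match["quantity"].lower(),
        subject=match["subject"],
        unit=match["unit"].lower(),
        # Drop thousands separators before converting to a number.
        value=float(match["value"].replace(",", "")),
    )

tn = parse_phrase("The distance from New York to London in kilometers is 5,581.")
tn.notes.append("Great-circle estimate; source not yet verified.")
print(tn)
```

A real system would need far more flexible language handling than one regular expression, but the sketch shows the core point: the number never travels without its quantity, subject, and unit.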
Allen Razdow, CEO of True Engineering Technology, believes the technology will interest businesses that rely on accurate numbers for important decisions, particularly engineering firms. For example, workers at an automotive plant might need to access the most recent emissions figures, which may be scattered across a range of different electronic documents. If these figures were stored as truenumbers, any worker could search for them online and find the latest, most accurate figure. The system could also automatically perform conversions from one unit to another and flag any potential mismatches and errors. Users can also paste truenumbers onto websites or into Microsoft Office documents, keeping them linked to the contextual data stored on the Web server.
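The conversion and mismatch checks mentioned above might look something like the following sketch. It assumes a small, hand-written table of length units and a 1 percent tolerance; the names `TO_METERS`, `convert`, and `figures_agree` are hypothetical and do not come from the company.

```python
import math

# Assumed conversion factors from each length unit to meters.
TO_METERS = {
    "meters": 1.0,
    "kilometers": 1000.0,
    "miles": 1609.344,
    "feet": 0.3048,
}

def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a length from one known unit to another."""
    for unit in (from_unit, to_unit):
        if unit not in TO_METERS:
            raise ValueError(f"Unknown length unit: {unit!r}")
    return value * TO_METERS[from_unit] / TO_METERS[to_unit]

def figures_agree(value_a: float, unit_a: str,
                  value_b: float, unit_b: str,
                  rel_tol: float = 0.01) -> bool:
    """Return True if two figures describe the same length within rel_tol."""
    return math.isclose(convert(value_a, unit_a, unit_b), value_b, rel_tol=rel_tol)

# The same distance reported in two documents, in different units.
print(round(convert(5581, "kilometers", "miles"), 1))
print(figures_agree(5581, "kilometers", 3468, "miles"))   # consistent figures
print(figures_agree(5581, "kilometers", 5581, "miles"))   # units were mixed up
```

The second comparison is the Mars Climate Orbiter problem in miniature: the digits match, but the quantities they describe do not.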
Razdow sees the technology as part of the movement toward the semantic Web, which aims to let computers store the meaning of pieces of information as well as the information itself. “I think there are natural, practical evolutions of how information gets represented that get to be more semantic by degrees,” Razdow says. Numbers, he argues, are a good place to start because people follow certain rules when speaking about them. For example, it’s easy to train the system to recognize which units represent distances, and to throw up error alerts if the user types that a distance is “5,581 seconds.”
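The kind of rule Razdow describes can be illustrated with a short sketch that maps each quantity and each unit to a dimension and raises an alert when the two disagree. The lookup tables and the `check_units` function are assumptions for illustration, not the system's real vocabulary.

```python
# Assumed lookup tables: which dimension each unit and quantity belongs to.
UNIT_DIMENSIONS = {
    "meters": "length",
    "kilometers": "length",
    "miles": "length",
    "seconds": "time",
    "hours": "time",
    "kilograms": "mass",
}

QUANTITY_DIMENSIONS = {
    "distance": "length",
    "duration": "time",
    "mass": "mass",
}

class UnitMismatchError(ValueError):
    """Raised when a unit does not fit the quantity it is attached to."""

def check_units(quantity: str, unit: str) -> None:
    """Raise UnitMismatchError if the unit cannot measure the quantity."""
    expected = QUANTITY_DIMENSIONS.get(quantity)
    actual = UNIT_DIMENSIONS.get(unit)
    if expected is None:
        raise KeyError(f"Unknown quantity: {quantity!r}")
    if actual is None:
        raise KeyError(f"Unknown unit: {unit!r}")
    if expected != actual:
        raise UnitMismatchError(
            f"A {quantity} should be a {expected}, but {unit!r} measures {actual}."
        )

check_units("distance", "kilometers")      # passes silently
try:
    check_units("distance", "seconds")     # the article's "5,581 seconds" case
except UnitMismatchError as err:
    print("Alert:", err)
```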
“In the past, we thought of numbers as having an absolute value and an absolute measure of their authority, but of course that’s not really ever the case,” says Bruce Jenkins, an analyst at Ora Research in Cambridge, MA, who’s been briefed about the technology.
“The pedigree and authority of numbers in a system will become so much more visible with this technology,” Jenkins says, “and weaknesses, questions over the authority of numbers, will become much easier to see and correct.”
Jim Hendler, a professor of computer science at Rensselaer Polytechnic Institute who studies the semantic Web, says that the approach is in line with how he expects semantic technologies to be commercialized. “The semantic Web is really an infrastructure technology that’s used to enhance what we already do on the Web and to create new applications,” he says.
Some observers have questioned whether users will take the time required to add contextual information to semantic systems and to keep them up to date. But Hendler says that if a system is well designed, adding contextual information can actually save users time in the long run. “It’s a mistake to think that metadata makes things harder or causes extra work,” he says.
Though Numberspace is being marketed primarily to engineering companies, Razdow says the website could also be used in any industry that deals heavily with numbers. Research firms, journalists, and academics, for example, might all benefit, he says.