On the evening of April 27 a ferocious rain raked the windows beside Jamie Williams’s cubicle as the physicist sat, exhausted, immersed in the minutiae of food science. On the computer screen before him were raw tables of information from the U.S. Department of Agriculture, containing data on 7,000 foods, from blackberries to beef. He and a four-person team were “curating” the data, readying it for a new kind of online search. He combed through the tabs that identified 150 properties (nutrients, calories, carbohydrates, and so on), making sure the various abbreviations were consistent and readable by computers. He organized foods into groupings to facilitate natural-language queries. A search for nutritional information on “milk” would provide an average value, for example, while “skim milk” would provide a specific answer.
Williams wasn’t toiling in a redoubt of Silicon Valley Web entrepreneurs but in a midwestern citadel of science geeks: Wolfram Research, in Champaign, IL, housed in an office block overlooking a Walgreens and a McDonald’s. This was the corporate lair of Stephen Wolfram, the physicist and maker of Mathematica, which is generally acknowledged to be the most complete technical and graphical software for mathematicians, scientists, and engineers. Williams was working on something his company was calling a “computational knowledge engine”: Wolfram Alpha. In response to questions, Alpha was meant to compute answers rather than list Web pages. It would consist of three elements, honed by hand in Champaign: a constantly expanding collection of data sets, an elaborate calculator, and a natural-language interface for queries.
What might Wolfram’s system do that Google can’t? Say you wanted to know how much cholesterol and saturated fat lurked in a slab of your grandmother’s cornbread. You’d transcribe its ingredients from her yellowed index card to an online query bar, and Alpha would run computations and produce a USDA-style nutrition label. “Sure, you could go to Google, find out calories in a standard egg, and so on–but what a pain in the ass it would be!” exclaimed Wolfram Research cofounder Theodore Gray. “You’d need the data. And you’d need the data to be in forms that can be readily converted, if need be. And you’d need to add them up. You can do it, just as in earlier decades–you could go to the library to find a reference, and today you can go to Google or another search engine to get started. But we make it far easier.” With a conventional search engine, he added, “enter ‘one cup of sugar, one pound of flour,’ and it completely throws up all over your screen.”
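The recipe computation Gray describes boils down to summing each nutrient across ingredients. Here is a minimal sketch of that idea in Python; the ingredient names, nutrient values, and the `nutrition_label` function are illustrative stand-ins, not actual USDA figures or anything from Alpha itself:

```python
# Per-unit nutrient data, keyed by ingredient. Values are rough illustrative
# numbers (one egg, one cup of cornmeal, etc.), NOT actual USDA figures.
NUTRIENTS_PER_UNIT = {
    "egg":            {"calories": 72,  "saturated_fat_g": 1.6, "cholesterol_mg": 186},
    "cup_cornmeal":   {"calories": 442, "saturated_fat_g": 0.6, "cholesterol_mg": 0},
    "cup_buttermilk": {"calories": 98,  "saturated_fat_g": 1.3, "cholesterol_mg": 10},
    "tbsp_butter":    {"calories": 102, "saturated_fat_g": 7.3, "cholesterol_mg": 31},
}

def nutrition_label(recipe):
    """Sum each nutrient over a recipe.

    recipe: dict mapping ingredient name -> quantity (in that ingredient's unit).
    Returns a dict of nutrient totals, like the label a query would generate.
    """
    totals = {}
    for ingredient, quantity in recipe.items():
        for nutrient, value in NUTRIENTS_PER_UNIT[ingredient].items():
            totals[nutrient] = totals.get(nutrient, 0) + value * quantity
    return totals

# Grandmother's cornbread, transcribed from the index card.
cornbread = {"egg": 2, "cup_cornmeal": 1, "cup_buttermilk": 1, "tbsp_butter": 4}
label = nutrition_label(cornbread)
```

The arithmetic is trivial; the hard part, as Gray notes, is having the data in a consistent, convertible form in the first place, which is exactly what the curation work was for.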
This is one example of the sort of thing Alpha was meant to do: provide deeper, more specific, and more graphically dressed-up answers to certain kinds of questions–though a limited set at first. Queries for “D# major” would produce graphics of the musical scale; queries for “Venus” would produce detailed, current maps of the night sky; queries on pairs of companies would produce comparative charts and graphs. It would add extra information: a search for “New York London distance” would produce the answer in miles, kilometers, and nautical miles; a map showing the flight path; and a comparison of how long it would take a jet, a sound wave, and a light beam to make the trip. Ask it about a word (prefaced by the word “word”), and it would generate etymology tables and synonym networks. To do these sorts of things, it would start with the math and science data sets and formulas already held in Mathematica and build from there. Some of the new information, such as government data on food, would need only minor reorganization. That’s what Williams was doing. Other kinds, such as real-time stock data, required licensing. Still others, such as data on aircraft, would be gathered from open Web sources such as Wikipedia and Freebase, and cleaned up–curated.
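The distance-and-travel-time query can be sketched in a few lines. This is my own back-of-the-envelope illustration, assuming rough city coordinates and typical speeds (an airliner’s cruise speed, sound at sea level, light in a vacuum), not Alpha’s actual data or output:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# New York -> London, with approximate city-center coordinates.
dist_km = haversine_km(40.71, -74.01, 51.51, -0.13)
dist_miles = dist_km / 1.609344      # statute miles
dist_nautical = dist_km / 1.852      # nautical miles

SPEEDS_KM_H = {
    "jet (cruise)":      900,        # typical airliner cruise speed
    "sound (sea level)": 1235,       # ~343 m/s
    "light (vacuum)":    1.079e9,    # 299,792,458 m/s
}
hours = {name: dist_km / v for name, v in SPEEDS_KM_H.items()}
```

The jet takes a bit over six hours, a sound wave about four and a half, and a light beam a small fraction of a second–the kind of tangential comparison the engine was meant to serve up unprompted.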