Two weeks later–on the eve of the Wolfram launch–Google threw an event dubbed “Searchology,” at which it announced another new data-crunching service, Google Squared. The technology, now available on its Google Labs site, combines information from different Web sources and packages it into nice tables. A search for “roller coasters,” for example, produces a table of American amusement-park rides from Excalibur to Montezooma’s Revenge. Columns provide thumbnail photos, descriptions, heights, and lengths. Users can click on the results to delete errors in the original table and refine the search. Marissa Mayer, a Google vice president, said during the event that Google Squared “pushes search in an entirely new direction.” She added, “It is a hard computer science problem–to take this unstructured information and present it in a structured way.”
Google also said it would provide better real-time data in search results. If you search for “earthquakes San Francisco,” Google, like Alpha, will now push out the latest relevant reports from the U.S. Geological Survey. (It is doing something similar with real-time data on airline flights and sports scores.) Peter Norvig, Google’s research director, told me that the technologies represent a foretaste of the company’s efforts in finding, combining, and presenting numerical data. “I would say in general, our approach would be more toward open systems rather than closed, curated ones,” Norvig said, “but I do appreciate the broad kind of user interface [Wolfram Alpha is] providing, and the data analysis tools. We would like to do more of that. Maybe having him out there will push us to release more, faster–I don’t know.” Scott Kim, executive vice president for technology at the search engine Ask, was more direct in suggesting that Wolfram would have an influence. “I think it opens people’s eyes–the general public’s eyes–to what you can get out of a computational engine, and how that can be integrated into a search engine,” he said of Alpha. “This is absolutely part of the future of search, and there is a long way to go.”
Ever the showman, Stephen Wolfram made sure the two supercomputers were dramatically lit with blue and green LED lights. In the new data center near Champaign, he’d set up a slightly elevated command post for himself. He’d arranged for the event to be webcast. And at 10:30 p.m. on May 15, as a tornado watch covered much of Illinois, he mouse-clicked a tab labeled “Activate,” and a wall-mounted screen showed computer clusters blinking to life. “Statistically speaking, there will be some issues, and it’s only a question of what issues they are,” he said that night. But despite voltage fluctuations and earlier overloads on the supercomputers, Wolfram avoided a crash of the sort that bedeviled Cuil.
His engine itself still faced a big performance problem, however. As complete and elegant as it was when it knew something, there was much it did not know (and it was hard to guess what it might know). “Wolfram Alpha isn’t sure what to do with your input” was a frequent response from the site. This was mainly because of the huge gaps in its curated data; Alpha is a library whose shelves are only partly filled. It is largely blind to history, politics, literature, sports, social sciences, and pop culture. The site was also bedeviled by an inflexible natural-language interface. For example, if you searched for “Isaac Newton birth,” you got Newton’s birth date (December 25, 1642; you also learned that the moon was in the waxing-crescent phase that day). But if you searched for “Isaac Newton born,” Alpha choked. Aaronson tested it with me and found it couldn’t answer “Who invented the Web?” and didn’t know state-level GDP figures, only national ones. But it could ace all sorts of math questions, including a request for the surface area of the earth. Aaronson asked it, “What is the GDP of Ireland divided by the cosine of 42?” and received a chart reflecting the relevant calculations for GDP figures from 1970 to 2007, presented on a logarithmic scale.
Finally, there was a documentation problem. Clicking links revealed a variety of sources: the CIA’s World Factbook, the website Today in Science History, the U.S. Geological Survey, Dow Jones, and the Catalogue of Life, an internationally maintained index of the world’s known species. But nothing specified which source had provided which fact. (Gray says the company is working on adding such labels to specific facts and to computed results.)