It’s elemental: Wolfram Alpha may initially have limited scope, a sometimes rigid user interface, and vague source information, but company cofounder (and element collector) Theodore Gray says the leading search engines are afflicted by “a huge failure of imagination” and are bad at math.
In 1993 a newly minted University of Maryland graduate, a brainy Russian with an interest in computers, interned at Wolfram Research. He did some hands-on work on Mathematica’s kernel, or the core of the software. Then he went off to get his master’s degree at Stanford–and to cofound Google. Today Google handles 64 percent of all searches made by Americans, but the erstwhile Wolfram intern, Sergey Brin, is not entirely happy. He dominates an industry, he’s worth $12 billion, and he hobnobs at the World Economic Forum’s annual meeting in Davos, Switzerland. Search technology, however, hasn’t kept pace with his personal rise. “[T]here are important areas in which I wish we had made more progress,” Brin wrote in Google’s 2008 annual report. “Perfect search requires human-level artificial intelligence, which many of us believe is still quite distant. However, I think it will soon be possible to have a search engine that ‘understands’ more of the queries and documents than we do today. Others claim to have accomplished this, and Google’s systems have more smarts behind the curtains than may be apparent from the outside, but the field as a whole is still shy of where I would have expected it to be.”
Among all the leaders in Web search over the years–from Excite (went bankrupt) to Alta Vista (absorbed by Yahoo in 2003) to today’s top five players (Google, Yahoo, Microsoft, Ask, and AOL)–the core approach has remained the same. They create massive indexes of the Web–that is, their software continually “crawls” the Web, collecting phrases, keywords, titles, and links on billions of pages in order to find the best matches to search queries. Google triumphed because its method of ranking pages, based partly on analyzing the linking structure between them, produced superior results. But while the Web has expanded 10,000-fold over the past decade, search engines haven’t made comparable progress in their ability to find specific answers and then put them together intelligently. The Semantic Web–the long-envisioned system in which information is tagged to allow such processing–is still a long way off.
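The index-then-rank approach described above can be illustrated with a toy sketch. It is not Google's actual method, only a simplified, PageRank-style illustration under stated assumptions: the four pages, their text, and their links are invented, and the function names (`build_index`, `link_scores`, `search`) and the damping value are hypothetical choices made for this example.

```python
from collections import defaultdict

# Toy "crawled" pages: text plus outgoing links. All data here is made up.
PAGES = {
    "a": {"text": "wolfram alpha computes answers", "links": ["b", "c"]},
    "b": {"text": "search engines index the web", "links": ["c"]},
    "c": {"text": "web search ranks pages by links", "links": ["a"]},
    "d": {"text": "web pages about search", "links": ["c"]},
}

def build_index(pages):
    """Map each word to the set of pages that contain it (an inverted index)."""
    index = defaultdict(set)
    for page_id, page in pages.items():
        for word in page["text"].split():
            index[word].add(page_id)
    return index

def link_scores(pages, damping=0.85, iterations=50):
    """Simplified PageRank-style power iteration over the link graph."""
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, page in pages.items():
            # A page with no outgoing links spreads its score evenly.
            targets = page["links"] or list(pages)
            share = damping * scores[p] / len(targets)
            for t in targets:
                new[t] += share
        scores = new
    return scores

def search(query, pages, index, scores):
    """Return pages containing every query word, best link score first."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(pages)
    for w in words:
        matches &= index.get(w, set())
    return sorted(matches, key=lambda p: scores[p], reverse=True)

index = build_index(PAGES)
scores = link_scores(PAGES)
results = search("web search", PAGES, index, scores)
print(results)
```

In this toy graph, page "c" ranks first for the query because three other pages link to it. The sketch also shows the limitation the article describes: the engine can only return pages that contain the query words; it cannot combine facts from several pages into a new answer.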
Last year Yahoo launched something called SearchMonkey, which allows Web-page publishers to improve search returns by adding tags telling the search engine’s software, “This is an address,” “This is a phone number,” and so on. (So now, if you search on Yahoo for a restaurant, you may receive, beyond a link to the restaurant’s page, bullets listing the restaurant’s address, its phone number, and a compilation of reviews.) “What SearchMonkey is doing is taking the promise of the Semantic Web and putting it out in the open so publishers can participate,” says Prabhakar Raghavan, head of Yahoo Labs. Google recently began doing something similar, called “rich snippets.”
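To make the tagging idea concrete, here is a minimal sketch of how software might pull labeled fields out of a page. It does not reproduce SearchMonkey's or Google's actual formats: the `data-field` attribute, the sample restaurant, and the `FieldExtractor` class are hypothetical stand-ins, and the parsing uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# Hypothetical markup: the "data-field" attribute is an illustrative stand-in
# for whatever tagging vocabulary a publisher and a search engine agree on.
SAMPLE_PAGE = """
<div>
  <h1 data-field="name">Chez Example</h1>
  <p>Visit us at <span data-field="address">12 Main Street, Springfield</span>.</p>
  <p>Call <span data-field="phone">555-0100</span> for reservations.</p>
</div>
"""

class FieldExtractor(HTMLParser):
    """Collect the text inside elements that carry a data-field attribute."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # field name of the element being read, if any

    def handle_starttag(self, tag, attrs):
        field = dict(attrs).get("data-field")
        if field:
            self._current = field
            self.fields[field] = ""

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        # Assumes tagged elements are not nested, which holds for this sample.
        self._current = None

extractor = FieldExtractor()
extractor.feed(SAMPLE_PAGE)
snippet = {name: text.strip() for name, text in extractor.fields.items()}
print(snippet)
```

Because the publisher has said which string is the address and which is the phone number, the search engine does not have to guess from surrounding text. That is what allows it to list those details as bullets beside the link.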
But such ideas have been slow to spread across the Web, even though the World Wide Web Consortium (W3C), the international standard-setting body led by Tim Berners-Lee, has set out specifications to help implement them more broadly. And even if the W3C standards were broadly applied, they would not offer much guidance on computation, says Ivan Herman, who heads the W3C’s semantic efforts from Amsterdam: “How this data is combined with numerical calculation and math processes is not well defined, and that is certainly an area in which we have to work.”
So while today’s search engines are increasingly broad and useful–expanding into new categories (maps, photographs, videos, news), learning to answer simple questions (“What is the population of New York?”), and even doing basic conversions (“What is 10 pounds in kilograms?”)–they aren’t particularly deep or insightful. “While Google is great,” says Daniel Weld, a computer scientist at the University of Washington and a Semantic Web researcher, “personally I would rather have the ship’s computer on the Starship Enterprise, where you ask high-level questions and it gives the answer, and explains the answer, and then you can say, ‘Why did you think that was true?’ and it takes you back to the source.”
As Stephen Wolfram sees it, he’s providing the infrastructure for answering questions in truly intelligent ways–albeit on subjects biased initially toward geeky domains. “We don’t have the problem of dealing with the vicissitudes of the stuff that’s just sort of out there on the Web,” he says. “We’ve bitten the bullet and said, ‘Let’s curate all this data ourselves!’ It would be great if the Semantic Web had happened and we could just go and pick up data and it would all fit beautifully together. It hasn’t happened.”