Trends in Search: Technology
The advantage conferred on Google by its PageRank algorithm, once overwhelming, is gradually disappearing. Many other clever algorithms have been developed; indexing and searching are being applied to more data sources and data types; and ever more nuanced user interfaces and functions are being offered to users.
Some of these efforts seem quite promising. Amazon has scanned more than 100,000 books and made their text searchable for Amazon users. Google Print provides a similar service and also offers direct links to bookselling sites. PubSub, a small startup in New York City, has developed a high-performance “matching engine” that monitors online information: if you subscribe to a topic, PubSub will scan data in real time and notify you whenever there is news. For the sorting and clustering of search results, the leader is Vivisimo, a Carnegie Mellon University spinoff in Pittsburgh, with its new Clusty website. Software from Blinkx, of San Francisco, lets users search multiple information sources, including their desktops, websites, and blogs. X1 Technologies of Pasadena, CA, also provides a popular desktop search tool.
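To make the idea of a "matching engine" concrete, here is a minimal sketch in Python. It is not PubSub's implementation, whose internals are not described here. It assumes that topics are single keywords and that matching means exact word overlap; the class and method names (`MatchingEngine`, `subscribe`, `publish`) are hypothetical. The point it illustrates is the inversion of ordinary search: subscriptions are stored, and each new item is tested against them as it arrives.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class MatchingEngine:
    """Toy standing-query matcher (hypothetical; single-keyword topics only)."""

    def __init__(self) -> None:
        # Maps a lowercase keyword to the set of users subscribed to it.
        self._subscribers: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, user: str, topic: str) -> None:
        """Register a user's interest in a single-keyword topic."""
        self._subscribers[topic.lower()].add(user)

    def publish(self, text: str) -> List[Tuple[str, str]]:
        """Return the sorted (user, topic) notifications for one incoming item."""
        # Normalize the item into a set of lowercase words without punctuation.
        words = {w.strip(".,;:!?\"'()").lower() for w in text.split()}
        words.discard("")
        notifications = set()
        for word in words:
            # .get avoids creating empty entries in the defaultdict.
            for user in self._subscribers.get(word, ()):
                notifications.add((user, word))
        return sorted(notifications)


engine = MatchingEngine()
engine.subscribe("alice", "Google")
engine.subscribe("bob", "Yahoo")
alerts = engine.publish("Google launches new desktop search.")
```

Because the work per item depends on the number of words in the item rather than on the number of subscribers, a structure of this kind can in principle keep up with a stream of incoming data; a production system would need phrase matching, persistence, and delivery, none of which are shown.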
As these examples suggest, many new search functions are being introduced by startups rather than by Google or other established companies. A few of these startups may become large, independent firms. But most will remain small vendors, will be acquired, or will simply fail, depending on what Google, Yahoo, or Microsoft choose to do. Many offer products that would be natural additions or complements to existing search services, since their utility depends upon access to a search engine. But Google and Yahoo do not usually provide such access, even though it would benefit users. Google’s sole Web API is laughably limited, offering little functionality and contractually restricting users to 1,000 queries per day.
Just what services could be built upon a fully open Google architecture? They could take many forms, but some of the most obvious would make indexing and searching processes on the desktop, on Web servers, and on Google’s own website work together better. A single search could then span not just Google’s index of the public Web but whatever other sources might be appropriate: a newspaper archive, a medical database, an antique-car parts catalogue, or your own hard drive. Google, or others building upon its APIs, would unify the results, explain any access restrictions on particular sources, and facilitate purchases of information. At the same time, independent firms could create services that call on Google’s search and indexing functions to retrieve information, but present that information in new and creative ways.
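The unified search described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any Google API: the names `Hit`, `make_backend`, and `federated_search` are hypothetical, the sources are in-memory stand-ins, and each source is assumed to report a relevance score between 0 and 1 that can be compared with the scores of other sources.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str       # which backend produced the hit
    title: str
    score: float      # relevance in [0, 1], as reported by the backend
    restricted: bool  # True if the source requires a subscription or login


# A backend is any callable that maps a query string to a list of hits.
Backend = Callable[[str], List[Hit]]


def make_backend(name: str, docs: Dict[str, float], restricted: bool = False) -> Backend:
    """Build a toy backend over an in-memory {title: score} mapping (hypothetical)."""
    def search(query: str) -> List[Hit]:
        q = query.lower()
        # Substring match on the title stands in for a real index lookup.
        return [Hit(name, title, score, restricted)
                for title, score in docs.items() if q in title.lower()]
    return search


def federated_search(query: str, backends: List[Backend]) -> List[Hit]:
    """Send one query to every backend and merge the hits into one ranked list."""
    hits: List[Hit] = []
    for backend in backends:
        hits.extend(backend(query))
    # Highest score first; ties are broken by source and title for determinism.
    return sorted(hits, key=lambda h: (-h.score, h.source, h.title))


web = make_backend("public_web", {"Chinese economy outlook": 0.8, "Antique car parts": 0.5})
archive = make_backend("newspaper_archive", {"Chinese economy slows": 0.9}, restricted=True)
disk = make_backend("hard_drive", {"Notes on Chinese economy.doc": 0.6})

results = federated_search("chinese economy", [web, archive, disk])
for hit in results:
    note = " (access restricted)" if hit.restricted else ""
    print(f"{hit.score:.1f}  {hit.source}: {hit.title}{note}")
```

The `restricted` flag is where a unifying service could explain access limits or offer a purchase. In practice, scores from independent sources are rarely comparable without some normalization, which is one reason a common architecture matters.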
As the search industry evolves, it also touches upon – and often competes with – a widening array of other industries, from publishing to software, in both business and consumer markets. The search industry wants to become the starting point for a larger proportion of digital activities. Some companies are happy to oblige: Amazon, for instance, opens its databases to search services, so that search results can point directly to relevant Amazon products, bypassing the need to navigate Amazon’s own site. Others are less welcoming. Microsoft will be displeased, to put it mildly, if Google Desktop begins to supplant the traditional Windows desktop interface and file systems.
However, the most important trend in the search industry is the proliferation of new computing platforms – and the increasing cross-pollination of data between these devices, PCs, and Web services. These emerging – and merging – markets represent Google and Microsoft’s greatest opportunity for future growth and the greatest threat they pose to each other. In the absence of a common architecture, the information on these systems is almost unsearchable. Today, a user cannot possibly conduct a search such as “Show me everything about the Chinese economy that has appeared in the last month in my e-mail attachments, Word documents, bookmarked websites, corporate portal, voice mail, or Bloomberg subscription.” Many computing platforms, old and new, have no useful search facilities at all. Most existing search tools are available on only one or at most a few platforms; and due to their lack of standardization, they cannot talk to each other.
Thus, while Google provides an excellent service for searching the public Web and has made a good start on PCs with Google Desktop (the hard-drive search tool) and Google Deskbar (which performs searches without launching a browser), the search universe as a whole remains a mess, full of unexplored territories and mutually exclusive zones that a common architecture would unify. Given the size and growth rate of the markets involved, the dominant search provider a decade from now could easily have revenues of $20 billion or $30 billion per year.