A Google Killer Stumbles

Cuil’s rough launch shows the difficulty of challenging major search engines.

Erica Naone

July 31, 2008

Boasting big plans, startup search engine Cuil (pronounced “cool”) launched on Monday. The company sold itself on having indexed more pages than Google, ranking based on context rather than on popularity, and displaying results organized by concept within a beautiful user interface. There was just one problem: when the search engine launched, it didn’t work very well.

Cuil’s site was down intermittently throughout the day on Monday, and even when the site was up, it sometimes returned no results for common queries, or failed to produce the most relevant or up-to-date results. For example, as of Wednesday morning, searching Cuil for its own name returns nothing on the first results page that is related to the engine itself, in spite of the buckets of press it got this week.

“I’ve seen these sorts of things for all sorts of startups that get launched,” says search-engine expert Danny Sullivan, who runs Search Engine Land. “You have issues with how it’s displaying results; you have spam showing; you have a lot of duplicate results.” But Cuil wasn’t supposed to suffer from the common problems that all sorts of startups encounter. Its founders have impressive credentials: Anna Patterson and Russell Power both had major roles in building Google’s large search index, and Tom Costello researched search architecture and relevance methods for Stanford University and IBM. On top of the company’s talent, Cuil raised a reported $33 million in venture capital. “In many ways, Cuil was the exception,” Sullivan says. “They were one of the few people or companies out there where you would say, ‘Well, all right, I’d be dubious about anyone else, but if anyone’s going to have a chance, you should have a chance.’ But they didn’t deliver, and I think that makes it even harder now for startups to come along.”

One of Cuil’s main selling points is the size of its index. Claiming to have indexed 120 billion Web pages, which it states is three times more than any other search engine, the company says, “Size matters because many people use the Internet to find information that is of interest to them, even if it’s not popular.” But Sullivan notes that relevance may be the most important quality of search. “When you come into the idea of size, that starts getting into the question of obscure search,” he says. “The needle-in-the-haystack search sounds so very compelling–the idea that if you don’t have a lot of pages, you can’t search through the entire haystack. But, as Cuil has demonstrated very well, it doesn’t help you to look through the entire haystack if it gets dumped on your head, and all you can see is a bunch of hay out there.”

British investor Azeem Azhar, who has a strategy role at the startup search engine True Knowledge, notes that while it’s useful to have a large base of knowledge, sometimes the sample that’s selected matters more. “There are certain things that people expect to have, and there are certain facts that are more useful than others,” he says. True Knowledge, which aims at the subset of searchers who are looking for answers to direct questions, is currently building a database of relevant facts that can be used to answer questions such as, “Who was president when Barack Obama was a teenager?” The company hopes that by focusing on facts of broad interest, such as those relating to famous people and places, it will be useful to people even as it solicits contributions from them to round out its database. When a user asks a question that the system can’t answer, it returns, “If there are any answers, I couldn’t find any”; invites the user to add to the database; and points to traditional search results.

Azhar also notes that it’s hard to approach many common search problems directly. For example, while many companies are trying to improve search by parsing documents using natural-language processing or, like Cuil, analyzing them for context, True Knowledge is building a database containing facts and their relations to each other. “It’s a testament to how difficult it is to improve automatic understanding of documents that we said we can build a database of several hundred million facts more easily,” he says.

True Knowledge, which is still in a private experimental release, has no plans to go head to head against the majors. Azhar says that the company may eventually try to sell its services to existing portals as a feature that could enrich traditional search results.

That may be the safer approach. Positioning yourself as an alternative to Google, or, for that matter, to Microsoft’s and Yahoo’s search engines, is highly unlikely to be a viable strategy at this point, Sullivan says. “[Startups can] really underestimate the amount of work that’s involved with the incredible task of trying to compete with Google.” Instead, he adds, startup search engines might do better to present themselves as supplementing what the existing major search engines offer, or as providing good results for particular types of content.
