It’s no secret that Jimmy Wales, Wikipedia’s founder, thinks it’s time for a new style of Internet search engine. In public remarks, postings to electronic mailing lists, and elsewhere over the past six months, he has made it plain that he sees drawbacks to fully software-driven search engines such as Google and Yahoo. He has been equally clear that he thinks the collaborative, decentralized publishing process behind Wikipedia might just be the answer.
Wales made his intentions official at a March 8 news conference in Tokyo, where he said that his new for-profit company, Wikia, would lead a project to launch a community-driven, open-source search site by the end of 2007. While the technical workings of Wikia Search are still being debated, it’s clear that the project will combine fully robotic Web exploration, or “spidering,” with Web-based tools that are in the hands of humans–both volunteer editors, who will organize and highlight the best content, and average users, who will vote on the usefulness of each search result, thereby influencing how high these results rank in future searches.
Wikia, based in San Mateo, CA, will host the search site and collect the advertising revenue it generates, but Wales believes that the site can be designed and built mainly by volunteers, through open-source collaboration of the type that gave rise to the Linux operating system. More than 700 volunteer developers are “already hacking away” at the problem using test servers donated to Wikia by supporters, according to Wikia CEO Gil Penchina.
The science of search is still an arcane one. It takes a deep understanding of file systems, index architectures, hard disk performance, networked storage, and fast query-time ranking–not to mention thousands of servers and an extreme amount of bandwidth–to build and run a major search engine. But today, Penchina and Wales argue, there is enough expertise outside the walls of the major search companies to design a competitive open-source search engine–one that they project could attract millions of users daily and capture as much as 5 percent of the $7 billion market for search-related advertising.
All that’s left is to build it. Not only does the Wikia search engine not exist yet; the company is still gathering technical suggestions at the most basic level (as can be seen by browsing the project’s mailing-list archive).
Wikia Search volunteers–the equivalent of Wikipedia’s core community of contributors and editors–might have jobs similar to those of the category editors at the Open Directory Project, a human-edited Web directory hosted by AOL in which each volunteer is in charge of tracking Web resources on a specific subject. Alternatively, all subjects might be open to all editors, who could use a Wikipedia-like system to rank or annotate search results and track other people’s revisions (and reverse them in cases of vandalism).
End users, meanwhile, might be asked to give a “thumbs up” or “thumbs down” vote about each individual search result, the way they can at the collaborative news aggregator Digg. Or they might be asked to “tag” or add descriptive words to results, helping others find them later, in the style of the social search sites TagWorld and Prefound and the photo-sharing community Flickr.
“We have a lot of things developing in parallel, and some of our projects are actually competing with each other, so we’ll have to see what the outcome is,” says Penchina. That will take patience, but Wikia is in no hurry to finalize a solution, he says. “We have this saying internally: ‘No a priori thinking.’ Communities evolve in their own special way, and anyone who thinks they know where the crowd is going to go generally doesn’t understand crowd psychology.”
Only a few matters are settled, according to Penchina. One is the basic premise: that Wikia Search will combine the strengths of software and people. “Computers are useful for large-scale problem solving like building an index, but machine judgment is usually never as good as human judgment, so you need a blend of the two,” he says.
Another settled matter is that Wikia Search’s developers won’t attempt to reinvent the wheel: the purely algorithmic components of Wikia Search will be built on top of the existing open-source search engines Nutch and Lucene, both initiated by independent software developer Doug Cutting.
Reaction to the news of Wikia’s ambitions is mixed. Some in the technical community say that Internet users deserve a search engine whose workings are open for all to examine, in contrast to the closely guarded ranking algorithms used by Google and its peers.
Others have underscored the huge challenges in going up against the likes of Google, which employs many of the world’s best brains in information-retrieval technology, owns a vast global infrastructure of servers, and dishes up results good enough that more than one in four Internet users make a stop at the search engine every day. “Google and Yahoo and MSN and Ask do a pretty damned good job,” remarked search-industry veteran Stavros Macrakis in a late February post to the Wikia Search mailing list. “It’s not as though the competition was a $2,000 Encyclopaedia Britannica which is always years out of date.”
Penchina acknowledges the scale of the challenges but says Wikia is in the search business for the long haul. “I don’t know that we expect massively impressive results from day one,” he says. “Wikipedia has taken six years to get where it is.”
Wikia Search has a somewhat confusing genealogy. Wikipedia, which melded the idea of an online encyclopedia with the collaborative-editing technology of wikis, has been controlled since 2003 by the nonprofit Wikimedia Foundation, which also operates Wiktionary, Wikinews, Wikiquote, and other collaborative projects.
Wikia, on the other hand, is a for-profit company cofounded in 2004 by Wales and British Internet entrepreneur Angela Beesley under the original name Wikicities. It hosts hundreds of special-interest wikis, including wikis for genealogy buffs. Wikia has no direct connection to Wikipedia; however, several of Wikipedia’s most dedicated contributors are now employees at Wikia, including Beesley.
Wikia raised at least $4 million in venture capital in 2006 from a group including Bessemer Venture Partners, Omidyar Network, Amazon.com, and angel investors.