Business Impact

Opening Search to Semantic Upstarts

Yahoo’s new open-search platform is giving semantic search a helping hand.

Even if you have a great idea for a new search engine, it’s far from easy to get it off the ground. For one thing, the best engineering talent resides at big-name companies. Even more significantly, according to some estimates, it costs hundreds of millions of dollars to buy and maintain the servers needed to index the Web in its entirety.

However, Yahoo recently released a resource that may offer hope to search innovators and entrepreneurs. Called Build Your Own Search Service (BOSS), it allows programmers to make use of Yahoo’s index of the Web–billions of pages that are continually updated–thereby removing perhaps the biggest barrier to search innovation. By opening its index to thousands of independent programmers and entrepreneurs, Yahoo hopes that BOSS will kick-start projects that it lacks the time, money, and resources to develop itself. Prabhakar Raghavan, head of Yahoo Research and a consulting professor at Stanford University, says these might include better ways of searching videos or images, tools that use social networks to rank search results, or a semantic search engine that tries to understand the contents of Web pages, rather than treating them as just a collection of keywords and links.

“We’re trying to break down the barriers to innovation,” says Raghavan, although he admits that BOSS is far from an altruistic venture. If a new search-engine tool built using Yahoo’s index becomes popular and potentially profitable, Yahoo reserves the right to place ads next to its results.

So far, no BOSS-powered site has become that successful. But a number of startups are beginning to build their services on top of BOSS, and Semantic Web companies, in particular, are benefiting from the platform. These companies are developing software to process concepts and meanings in order to better organize information on the Web.

For instance, Hakia, a company based in New York, began building a semantic search engine in 2004. Its algorithms use a database of concepts–people, places, objects, and more–to “understand” concepts in documents. Hakia also creates maps linking together different documents, such as Web pages, based on these concepts in order to understand their relevance to one another. Riza Berkan, CEO of the company, says that focusing on the meaning of pages, instead of simply on the links between them, could serve up more relevant search results and help people find content that they didn’t even know they were looking for.
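The idea of linking documents through shared concepts can be illustrated with a minimal sketch. This is not Hakia's algorithm, which the company does not describe in detail here; the function name `link_by_concepts` and the sample data are hypothetical, and the sketch assumes the concepts for each document have already been extracted.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Set, Tuple


def link_by_concepts(
    doc_concepts: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], Set[str]]:
    """Map each pair of documents to the concepts they share (hypothetical helper)."""
    # Inverted index: concept -> documents that mention it.
    by_concept: Dict[str, Set[str]] = defaultdict(set)
    for doc, concepts in doc_concepts.items():
        for concept in concepts:
            by_concept[concept].add(doc)

    # Any two documents that mention the same concept get a link.
    links: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for concept, docs in by_concept.items():
        for a, b in combinations(sorted(docs), 2):
            links[(a, b)].add(concept)
    return dict(links)


# Made-up documents and concepts, for illustration only.
docs = {
    "page1": {"Yahoo", "search"},
    "page2": {"search", "New York"},
    "page3": {"New York"},
}
concept_map = link_by_concepts(docs)
```

In this toy data, page1 and page2 are linked through "search" and page2 and page3 through "New York", even though none of the pages need to hyperlink to one another.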

However, in order to do this well, Hakia needs to have access to as many Web pages as possible, and this is where BOSS fits in. For a given query, Hakia uses Yahoo’s BOSS index to determine a set of relevant results. Hakia’s software then determines whether these pages have already been analyzed by the company’s semantic software. If they haven’t, they will be processed, and the results will be stored on Hakia’s servers. “We crawl the Web anyway,” says Berkan. “But without Yahoo’s index, we’d be behind on the sites that people are searching for today.” And the more popular pages Hakia scans, the better its index will be.
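The flow Berkan describes (fetch results for a query, check whether each page has already been analyzed, process and store the ones that have not) resembles a simple cache-on-demand loop. The sketch below is an assumption-laden illustration, not Hakia's code or the Yahoo BOSS API: `SemanticCache`, `fake_search`, and `fake_analyze` are hypothetical stand-ins, and storage is an in-memory dictionary rather than a server.

```python
from typing import Callable, Dict, List

# Hypothetical interfaces: a search backend maps a query to a list of URLs,
# and an analyzer maps a URL to a stored semantic record.
SearchBackend = Callable[[str], List[str]]
Analyzer = Callable[[str], dict]


class SemanticCache:
    """Analyzes each result page at most once and stores the record."""

    def __init__(self, search: SearchBackend, analyze: Analyzer) -> None:
        self.search = search
        self.analyze = analyze
        self.store: Dict[str, dict] = {}

    def query(self, text: str) -> List[dict]:
        records = []
        for url in self.search(text):
            # Only pages not seen before are sent to the analyzer.
            if url not in self.store:
                self.store[url] = self.analyze(url)
            records.append(self.store[url])
        return records


# Stand-ins for illustration only; real systems would call a search
# index and a semantic analysis pipeline here.
def fake_search(query: str) -> List[str]:
    index = {
        "semantic search": ["a.example/1", "b.example/2"],
        "search startups": ["b.example/2", "c.example/3"],
    }
    return index.get(query, [])


analyzed_log: List[str] = []


def fake_analyze(url: str) -> dict:
    analyzed_log.append(url)
    return {"url": url, "concepts": []}


cache = SemanticCache(fake_search, fake_analyze)
cache.query("semantic search")
cache.query("search startups")
print(analyzed_log)  # ['a.example/1', 'b.example/2', 'c.example/3']
```

The page shared by both queries is analyzed only once, which is why more popular queries steadily enlarge the stored index without repeating work.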

Another semantic startup, called Cluuz, from Ontario, Canada, is taking a slightly different approach. When a user searches with Cluuz, she will see Yahoo BOSS results, but they are reordered according to the startup’s own semantic search technology. “When you do a query,” says Alex Zivkovic, CTO of Cluuz, “we pass it on to Yahoo BOSS, and we get a list of results back … Then for each of those pages, the Cluuz engine analyzes the content, extracts entities–people, companies, phone numbers, and those sorts of things.” These concepts, he explains, are then checked against the concepts found on other pages, and the concepts that arise most often are deemed most relevant.
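One way to picture the reordering Zivkovic describes is to count how often each extracted entity appears across the result pages and move pages rich in frequent entities to the top. The sketch below assumes the entities have already been extracted, and the scoring rule (summing entity frequencies per page) is an illustrative guess, not Cluuz's actual ranking; the function name `rerank` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def rerank(results: List[str], entities: Dict[str, Set[str]]) -> List[str]:
    """Reorder result URLs by how common their entities are across all results."""
    # Count how many result pages mention each entity.
    freq = Counter(e for url in results for e in entities.get(url, set()))

    def score(url: str) -> int:
        # A page scores higher when its entities recur on other pages.
        return sum(freq[e] for e in entities.get(url, set()))

    # sorted() is stable, so tied pages keep their original order.
    return sorted(results, key=score, reverse=True)


# Made-up result list and entities, for illustration only.
results = ["p1", "p2", "p3"]
entities = {
    "p1": {"Yahoo"},
    "p2": {"Yahoo", "Cluuz", "Alex Zivkovic"},
    "p3": {"Cluuz", "Alex Zivkovic"},
}
reordered = rerank(results, entities)
print(reordered)  # ['p2', 'p3', 'p1']
```

Because ties preserve the incoming order, pages with no recognized entities simply keep the ranking the underlying search engine gave them.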

“Instead of looking at pages being linked based on the physical links, we’re looking at them in terms of whether or not they are talking about the same concepts,” says Zivkovic. This leads to a different user experience, he adds. For instance, terms relevant to a search query are pulled from the Web and highlighted on the right of the results page. A search for “Kate Greene” immediately pulls up my e-mail address at Technology Review, the university I attended, and a number of the people I’ve interviewed for past stories. Additionally, Cluuz provides other tools that allow the links and relationships between different semantic concepts to be visualized easily.

Even with the power of Yahoo’s index behind a company, there’s no guarantee that Hakia or Cluuz will be a success. But if they do take off, it could help Yahoo, which still lags way behind Google in terms of popularity, regain the edge. “The underlying philosophy [with BOSS] is, we’re not going to be able to invent everything on our own,” says Raghavan. “So we should facilitate innovation.”
