Technology Review - Published By MIT

Surprising Search Patterns

A new study questions the common assumption that search engines control the hierarchy of the Internet.

By Kate Greene

Friday, August 18, 2006


Conventional wisdom says that search engines are a fundamentally unfair technology -- favoring the most popular sites and helping them to become even more popular. This assumption, captured in the term "Googlearchy," is now being challenged by researchers at Indiana University who have used real-life data to test it. Their results show that Web-surfing behavior isn't as influenced by search-engine rankings as was previously thought.

This plot of red splotches, which represent popular websites, illustrates the idea that websites at the top of search engine results become increasingly popular, a concept known as “Googlearchy.” (Credit: Filippo Menczer and the trustees of Indiana University)

Understanding the impact of search engines isn't just an academic undertaking, says Filippo Menczer, professor of informatics and computer science at Indiana University in Bloomington. It has implications for creating online advertising models based on search results, building better search engines, devising online political campaigns, and understanding how people use the Internet. "Search engines have become the gateways between people and information," he says. "If a search engine has a bias, it has a huge impact because it can direct people to one sort of information and not another."

Search engines rank and list pages by popularity, a feature measured, in part, by how well-connected a page is to the rest of the Web. The more pages linking to a certain page, the higher that page will rank. Since these highly ranked sites are easier to find through a search, they will continue to get more hits. "The more popular sites get more and more links and new sites have no hope," says Menczer.
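The link-counting idea described above can be illustrated with a minimal sketch. It is not how any real search engine works: it assumes popularity is measured only by the number of distinct pages linking to a page, and the function name `rank_by_inbound_links` and the page names are hypothetical, chosen for illustration.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Order pages from most to fewest inbound links (ties broken by name).

    `links` is a list of (source, target) pairs. This is a simplifying
    assumption: real engines combine link structure with many other signals.
    """
    inbound = Counter()
    # Count each distinct link once and ignore pages linking to themselves.
    for source, target in set(links):
        if source != target:
            inbound[target] += 1
    # Include pages that have no inbound links at all.
    pages = {page for edge in links for page in edge}
    return sorted(pages, key=lambda page: (-inbound[page], page))


# Hypothetical toy Web: "hub" is already well linked, "new" has one link.
links = [
    ("a", "hub"), ("b", "hub"), ("c", "hub"),
    ("a", "b"), ("hub", "b"),
    ("c", "new"),
]
ranking = rank_by_inbound_links(links)
print(ranking)
```

In this toy graph the well-linked page sorts to the top and the newcomer sits below it, which is the starting point for the "rich get richer" worry Menczer describes.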

The researchers created two extreme Web-browsing models: a person who used only search engines to find content and a person who browsed without search engines, instead following links from one page to another. The researchers then compared these two models with real-life data about site traffic for Web pages and the number of links pointing to those pages.

They expected the real-world data to fall somewhere between the two extremes: targeted searching and haphazard surfing. Instead, it turned out that typical Web use -- presumably a combination of searching and surfing -- concentrated less on popular Web sites than either model had predicted. In other words, real-world Web searching does not fuel the Googlearchy nor does it keep less-popular sites from being found. "This was not what we expected and we were surprised by it," says Menczer.

Comments

  • Is this really a surprise?
    If I'm looking for something very basic - exchange rates, or a newly released book - I'll go with the first couple of hits off a search engine. If I'm searching for something very specific I don't want to see that I've got 400,000 pages that match my query. I'm going to keep on adding terms until I'm down to a few thousand pages. I would say that what you have in your first few lines on a page (or your data lines if those show up) has far more impact on whether or not I click through to your site than where you appear on the Google listing. It's common for me, and most of my friends, to click through 10-12 pages of links before hitting on the few that I think will work for me. I'm in fact far _less_ likely to click on the first one since overall those tend to be commercial sites.

    deirdrebeth
    08/18/2006
    Posts:25
    Avg Rating:
    3/5
    • Re: Is this really a surprise?
      If pages are ranked purely by PageRank (or some other popularity measure), it is easy to see that Googlearchy will exist. However, search engines first pick up the pages that match a query and then rank those filtered pages by PageRank (plus many other parameters). The pages at the top are therefore not necessarily popular globally, but they tend to be more popular than other pages within the filtered group. Googlearchy then exists only within individual queries, which obviously is less of a problem.
      By the way, PageRank is based on the popularity of pages, not websites. Also, authority is a better term than popularity, which seems to imply that authors link to pages without much thinking or editorial judgment.

      diklee
      08/18/2006
      Posts:1

MIT Massachusetts Institute of Technology © 2010 Technology Review. All Rights Reserved.