Technology Review - Published By MIT
Advertisement

arXiv blog

The Physics arXiv Blog produces daily coverage of the best new ideas from an online forum called the Physics arXiv on which scientists post early versions of their latest ideas. Contact me at KentuckyFC @ arxivblog.com

Email Subscription

Recently on the arXiv blog...

Recent comments on the arXivblog

  • royboy2019 : "Some foolish men declare that a Creator made the world. The doctrine that the world was created...
  • AKT : Let me ask you a question. What do you mean by Quantum Mechanics?   Many physicists in main...
  • rsanchez1 : As for me, I eagerly await his explanation for the Cosmic Microwave Background. The big bang...
  • ... : I was drawn to this discussion from Kurzweilai.net by the fact that the subject relates to my own...
  • smoker : What happens when light is red shifted so far that it stops being a wave form altogetherThat...
  • mvardon : Am I missing something here or are we exerting significant energies on a 'paradox' based on a...
  • ZephirAWT : The Top 30 Problems with the Big Bang http://www.metaresearch.org/cosmology/BB-top-30.aspRecently...
  • ZephirAWT : Actually there exists a number of models involving periodic Universe, some the similar one was...
  • B. Honest : What you are describing is very similar to the subatomic quantum fluctuations, where particles...
  • B. Honest : If our 'Universe' is ruled by quantum mechanics I think that, like subatomic particles that...
  • Taymon : This is my only objection to an otherwise good and well-written article--but it is a very serious...
  • ... : Agreed. Perhaps you and I have some differences in opinion on the strength of predictive modeling...
  • naasking : Say what you like, but it must have happened somewhere.Yes, it happened everywhere.Also, the CMB...
  • Netizen : The Big Bang is a problematic black box theory because it fails to answer what happened BEFORE it...
  • irjsiq : The 'Big Bang' never gained traction with my thinking . . . 'no beginning no end', though...
  • ... : I'm glad to see that Shu has put some flesh on those old dry bones. My preferred theory, also...
  • sfcwelch : rather narrow point of view, the speed of light has different speeds in different materials a...
  • luddite : Entropy is a quantifiable device, Which can measure the amount of matter in light, And the energy...
  • ZephirAWT : In dense aether theory Universe appear like fractal nested density fluctuation of hypothetical...
  • ZephirAWT : In dense aether theory the Universe does it simultaneously. It's entropy is decreasing above the...
Advertisement
Wednesday, February 17, 2010

Scientist Finds PageRank-Type Algorithm from the 1940s

Google's PageRank algorithm was developed in 1998. But a project to trace the history of such algorithms reveals an example from the 1940s.

The PageRank algorithm is a key part of Google's method of ranking web pages in search results. It uses the network of links between web pages to determine their value and, famously, judges a page to be important if it is linked to by other important pages.

One crucial feature of this idea is that it requires an iterative approach to constantly re-evaluate the value of a page as the importance of others varies. Iterative ranking algorithms have since become an important part of network theory.

PageRank was developed in 1998 by Google's founders Sergey Brin and Larry Page and its impact has been such that it's easy to forget that the approach was not entirely novel. Massimo Franceschet at the University of Udine in Italy points out that the idea has been successfully exploited a number of times in 20th century science, even before Brin and Page were born. Today, he presents a short history of iterative ranking algorithms and charts their evolution prior to Google's emergence.

He begins in reverse chronological order with the work of Jon Kleinberg, a computer scientist at Cornell University, who developed an almost identical approach to PageRank, just a few years earlier. Brin and Page even reference his work in their famous paper introducing PageRank.

Kleinberg called his algorithm Hypertext Induced Topic Search or HITS and it treated web pages as "hubs" and "authorities". It used the circular definition that authorities are pages that are pointed to by hubs and hubs are pages that point to authorities and requires an iterative approach to solve.

In the heady days of the dotcom boom in the late 20th century, before Google became so successful, Kleinberg's work received considerable media coverage.

Franceschet also examines the work of Gabriel Pinski and Francis Narin who developed a way of ranking journals. Their rule was that a journal is important if it is cited by other important journals. Like PageRank and HITS, this requires an iterative method to exploit the structure of links between journals to come up with a ranking.

Long before this, however, Charles H Hubbell at the University of Califronia , Santa Barbara, was analysing social networks in a similar way. In 1965, he published a technique for determining the importance of individuals based on the importance of the people who endorse them. This again has the characteristic circular definition and iterative solution. Hubbell is acknowledged by many including Kleinberg as a pioneer in iterative ranking theory.

But the big surprise is Franceschet's discovery of an even earlier forerunner to PageRank in the work of the Harvard economist Wassily Leontief. In 1941, Leontief published a paper in which he divides a country's economy into sectors that both supply and receive resources from each other, although not in equal measure. One important question is: what is the value of each sector when they are so tightly integrated? Leontief's answer was to develop an iterative method of valuing each sector based on the importance of the sectors that supply it. Sound familiar? In 1973, Leontief was awarded the Nobel Prize in economics for this work.

What's clear is that the ideas behind PageRank have a venerable history but the surprise is that they date back to at least the 1940s. It'll be interesting to see if anybody can find any similar work that predates this.

Ref:arxiv.org/abs/1002.2858: PageRank: Stand On The Shoulders Of Giants

Comments

  • Aristocracy
    Ever read anything by any of the Bronte sisters? All of aristocracy has functioned in the exact same manner as "PageRank" for hundreds, if not thousands, of years. One's social standing was based on who one knew. And their social standing was based on who they knew, and so on. All any of these researchers have done is codify what the Bronte sisters knew and clearly illustrated back in the 1800s.
    Rate this comment: 12345

    GrantRober...
    02/17/2010
    Posts:3
    Avg Rating:
    4/5
    • Re: Aristocracy
      When "all you do" is codify and formalize intuitive knowledge, you're doing a service to mankind.
      Rate this comment: 12345

      imd
      02/17/2010
      Posts:1
      Avg Rating:
      5/5
    • Re: Aristocracy
      So are you saying Sergey Brin and Larry Page are actually the Bronte sisters?
      Rate this comment: 12345

      bravid
      02/18/2010
      Posts:6
      Avg Rating:
      5/5
  • Almanachs
    The aristocracy almanacs ranked these from the 14th century on on same  basis as linked pages, who you knew and who they know and how important. The Bronte sisters had in in their mind but could also if wanted purchase an almanac
    Rate this comment: 12345

    eionmac
    02/27/2010
    Posts:1
  • This is a bit of a red herring
    Well, actually Leontief's theory just shows how to define a sustainable economy through the fixpoint of a linear operator. I see no connection with PageRank, except that PageRank is a fixpoint, too. But then Markov's work (1906) predates PageRank even more, as any work about fixed point of  linear operators. Rather, Seeley (1949) introduced reputation as a recursive concept using exactly PageRank's formulation, and is the oldest currently known instance of the idea. Kats (1953) and Wei (1952) did essentially the same using different starting points. They are psychologists and sociologists, but they can all be found in Wasserman and Faust's book about social networks (1994), which makes this news a bit late and quite a red herring. A rather detailed mathematical account can be found here: http://arxiv.org/abs/0912.0238
    Rate this comment: 12345

    vigna
    03/10/2010
    Posts:2
    • Heavily smoked (herring)?
      In fact, the connections are striking:

      a) A Leontief input-output table is nothing more than a matrix;
      b) Leontief closed system (which excludes exogenous demand and profit) is essentially Pinski and Narin bibliometric method, which is endorsed by Larry Page in PageRank patent;
      c) The stochastic reformulation of Leontief closed system (along the route that Geller traced for Pinski and Narin technique) is a weighted teleportation-free version of the PageRank algorithm.

      Furthermore:

      a) The solution to the stochastic reformulation of Leontief closed system is the leading eigenvector of a stochastic matrix (of normalized production coefficients), in analogy with the solution of the PageRank problem (the leading eigenvector of Google matrix);
      b) Individual solution scores correspond to total revenues of industries. In particular, the revenue of an industry B depends on the revenue of industries A that produce products for B weighted by the proportion of product that A produces for B. Hence, highly remunerated sectors are those that receive inputs from other highly remunerated industries with low propensity to differentiate their outputs among the other industries. Sounds familiar? It's PageRank logic!

      Details on: http://arxiv.org/abs/1002.2858

      Massimo Franceschet
      http://users.dimi.uniud.it/~massimo.franceschet/
      Rate this comment: 12345

      francesche...
      03/14/2010
      Posts:1
      • Re: Heavily smoked (herring)?
        These connections are very vague, and mostly related to the mathematical tools involved, which are quite common. They are no stronger, say, than those with Markov's work (1906!), as observed also by Massimo Marchiori (of HyperSearch fame). Seeley's paper, on the other hand, has the specific purpose of estimating reputation recursively starting from directed endorsement.
        Rate this comment: 12345

        vigna
        04/14/2010
        Posts:2
Advertisement

Log In

Forgot your password?     Register »
Advertisement
Technology Review July/August 2010

Current Issue

Can AIDS Be Cured?
Researchers are pursuing radical new strategies to eliminate HIV from the body.
•  Subscribe
Save 36%
•  Table of Contents
•  MIT News
» Gift Subscription
» Digital Subscription
» Reprints, Back Issues
» Subscribe
» Table of Contents
» MIT News

More Technology News from Forbes

Advertisement
MIT Massachusetts Institute of Technology © 2010 Technology Review. All Rights Reserved.