BlogRank v PageRank

The flaw behind the latest approach to blog search

Use a search engine to hunt for a decent blog on an unusual topic and the chances are that you’ll find little of real value.

The problem is that blogs are not knitted together with hyperlinks in the same way as the rest of the web, which is why traditional search algorithms do a poor job of ranking them. For example, the famous Google algorithm, PageRank, works by counting the number of links that point to a webpage. The more links a page receives, the higher it is ranked. But it also assumes that links from popular sites are more valuable, so each link is weighted according to the popularity of the page it comes from.
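To make the idea concrete, here is a minimal sketch of the textbook PageRank iteration in Python. It is not Google's production code: the function name `pagerank`, the damping factor of 0.85 and the tiny three-page graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to,
                # so a link from a popular page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: "a" and "b" both link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
# "c" ends up on top (two incoming links), then "a", then "b" (no incoming links).
```

In this toy graph, "b" is the stand-in for a typical blog: nobody links to it, so it sits at the bottom however good it is.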

That works pretty well for ordinary web pages but fails miserably for most blogs because the links they receive offer little indication of their quality or content. For example, there is no way to distinguish between a link made with pleasure or in anger. And blogs generally receive fewer links than other types of page, even when they are widely read and influential among a specific but small group of people.

So how to improve blog search? Apostolos Kritikopoulos and pals at the Athens University of Economics and Business in Greece say the key is to bolster the PageRank approach with other information about how blogs may be related. They say, for example, that it is possible to exploit a blog's topic, as determined by its tags, and also the contributions made by authors and commenters, who often add content to more than one blog.
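One way to picture this is to treat shared tags and shared contributors as extra, implicit links and then run a weighted PageRank over the enlarged graph. The sketch below does exactly that. It is an assumption-laden illustration, not the formula from the paper: the helper names `similarity_edges` and `weighted_rank`, the weights `w_tag` and `w_person`, and the three example blogs are all hypothetical.

```python
from itertools import permutations


def similarity_edges(blogs, w_tag=1.0, w_person=1.0):
    """Build implicit edges between blogs that share tags or contributors.

    `blogs` maps a blog name to {"tags": set, "people": set}.
    The weights are hypothetical tuning knobs, not values from the paper.
    """
    edges = {}
    for a, b in permutations(blogs, 2):
        shared_tags = len(blogs[a]["tags"] & blogs[b]["tags"])
        shared_people = len(blogs[a]["people"] & blogs[b]["people"])
        weight = w_tag * shared_tags + w_person * shared_people
        if weight > 0:
            edges[(a, b)] = weight
    return edges


def weighted_rank(nodes, edges, damping=0.85, iterations=100):
    """PageRank over weighted edges given as {(source, target): weight}."""
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is shared out evenly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical data: "a" and "b" share a tag and a commenter; only "c" is linked to.
blogs = {
    "a": {"tags": {"knitting"}, "people": {"ann"}},
    "b": {"tags": {"knitting"}, "people": {"ann", "bob"}},
    "c": {"tags": {"gadgets"}, "people": {"carl"}},
}
hyperlinks = {("a", "c"): 1.0}

# Merge explicit hyperlinks with the implicit similarity edges.
combined_edges = dict(hyperlinks)
for edge, weight in similarity_edges(blogs).items():
    combined_edges[edge] = combined_edges.get(edge, 0.0) + weight

links_only = weighted_rank(list(blogs), hyperlinks)
combined = weighted_rank(list(blogs), combined_edges)
# With hyperlinks alone "c" outranks "b"; with similarity edges added, "b" overtakes "c".
```

Even in this toy case the ordering flips once tags and commenters are counted, which gives a feel for why the two rankings described here can disagree so sharply.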

Taking these factors into consideration radically changes the kind of rankings that a blog search throws up. Kritikopoulos and co show how different their approach is by ranking the top 1000 blogs in a dataset from 2006 using PageRank and then using their own algorithm, called BlogRank. These lists (and one from another ranking method) have only 139 entries in common.

The top 3 PageRanked blogs are:

  1. BoingBoing
  2. Engadget
  3. grahame.livejournal.com/

The top 3 BlogRank blogs are:

  1. nocapital.blogspot.com/
  2. pseudomanitou.livejournal.com/
  3. tbogg.blogspot.com/

What this tells us about how BlogRank would work for actual keyword searches isn’t clear.

But it is apparent that the team will have its work cut out getting the kind of information that it claims gives BlogRank an advantage. For example, it’s not always possible to get hold of a full list of commenters on many blogs. And even when it is possible, identifying authors and commenters is tricky because they often hide their identity (ahem).

That may be BlogRank’s fatal failing.

Kritikopoulos and co conclude that their experimental results are “quite encouraging”. Maybe. They also say: “Much more experimental evaluation of our method, as well as tuning of its parameters is needed.”

That’s for sure. It may be some time before we get a search engine that does half as good a job for blogs as it does for the rest of the web.

Ref: arxiv.org/abs/0903.4035: BlogRank: Ranking Weblogs Based on Connectivity and Similarity Features
