Emerging Technology from the arXiv

First Measurement of 'Wordquakes' Shaking the Blogosphere

Certain words disrupt the blogosphere in the same way that earthquakes shake the planet. And that makes them ripe for an earthquake-like magnitude rating.

  • February 14, 2011

In 2007, when the search engine Technorati stopped counting, over 100 million blogs had appeared on the web. In a little over ten years, blogs have changed the nature of publishing.

So it’s no surprise that blogs have become the focus of intense study by scientists hoping to gain some insight into the nature of the creatures that produce them. (A blog, of course, is a web page with entries listed in reverse chronological order, maintained by one or more writers.)

Today, Peter Klimek at the Medical University of Vienna and a couple of buddies say there is a remarkable analogy between the way topics erupt into the blogosphere and how earthquakes rupture the planet.

These guys studied over 160 political blogs published in the US between 1 July 2008 and 3 May 2010. For each day, they counted the number of occurrences of every possible letter triplet, i.e. aaa, aab, aac … zzz. (There are 26^3 = 17,576 possible triplets, but more than half of these never occur.)
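As a rough illustration of the counting step, here is a minimal Python sketch. It assumes triplets are counted inside individual words and ignores case, which the article does not spell out. The function name and the two toy posts are made up for the example and are not taken from the paper.

```python
from collections import Counter
import re

def triplet_counts(text):
    """Count every three-letter sequence found inside the words of `text`.

    Assumption: triplets never span a word boundary and case is ignored.
    """
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for i in range(len(word) - 2):
            counts[word[i:i + 3]] += 1
    return counts

# Hypothetical toy data: one string of blog text per day.
posts_by_day = {
    "day 1": "the senator gave a speech",
    "day 2": "bloggers discuss the speech and the senator",
}

# One Counter per day, mapping triplet -> number of occurrences that day.
daily_counts = {day: triplet_counts(text) for day, text in posts_by_day.items()}

print(daily_counts["day 2"]["the"])
print(26 ** 3)  # number of possible triplets over a 26-letter alphabet
```

Keeping one counter per day gives a time series for each triplet, which is what the next step needs.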

They then looked for the day on which each triplet was most common and listed the words in which it occurred on that day. Finally, they searched their database for occurrences of these words in the 30 days before and after the peak.
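The peak-finding step can be sketched in a few lines of Python. This assumes the daily counts for one triplet or word are already available as a list with one entry per day. The function names are hypothetical, and how the paper handles ties or peaks near the edges of the dataset is not stated in the article, so the choices below are only one reasonable option.

```python
def peak_day(series):
    """Return the index of the day with the highest count (earliest day on ties)."""
    return max(range(len(series)), key=lambda day: series[day])

def window_around_peak(series, half_width=30):
    """Return the peak index and the counts from `half_width` days before to after it.

    Assumption: the window is simply cut short where it runs past
    either end of the observation period.
    """
    peak = peak_day(series)
    start = max(0, peak - half_width)
    end = min(len(series), peak + half_width + 1)
    return peak, series[start:end]

# Toy series of daily counts, made up for illustration.
counts = [0, 1, 1, 2, 9, 4, 2, 1, 1, 0]
peak, window = window_around_peak(counts, half_width=2)
print(peak, window)
```

A window dominated by counts after the peak points to a sudden, news-driven event, while one that builds up beforehand points to a discussion that grew inside the blogosphere.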

Klimek and co say this clearly shows two types of event. The first is a sudden spike in word frequency triggered by a news event such as the nomination of Sarah Palin as vice presidential candidate. Because these events are triggered from outside the blogosphere, Klimek and co call them exogenous.

The second is a gradual spike in which the discussion within the blogosphere reaches a crescendo and then dies away again. The use of the word inauguration before and after the inauguration of President Obama is an example. Klimek and co call these events endogenous because they arise within the blogosphere.

The main finding is that the distributions of event sizes and of fore- and aftershocks are remarkably similar to those found by seismologists. “The intensity of fore- and aftershocks follows Omori’s law, the distribution of event-sizes is of Gutenberg-Richter type,” say Klimek and co.
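For readers unfamiliar with the two laws, their standard seismological forms are given below. The symbols follow the usual textbook convention rather than the paper's notation, and the article does not quote the values Klimek and co fitted for blog data.

```latex
% Omori's law (modified form): the rate n(t) of aftershocks at time t
% after the main shock decays as a power of t, with constants K, c and p.
n(t) = \frac{K}{(c + t)^{p}}

% Gutenberg-Richter law: N(M) is the number of events of magnitude M or
% greater, with constants a and b.
\log_{10} N(M) = a - b\,M
```

Because magnitude is a logarithmic measure of event size, the Gutenberg-Richter relation amounts to a power law in the size of events.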

During the 670 days they were monitoring these blogs, they found over 1000 events, more than one wordquake per day.

In some ways, that’s not surprising. Word frequencies in most languages are known to follow power law distributions similar to those that govern earthquakes. Most of these studies have been done on snapshots of the language, such as the corpus of words in Wikipedia or the complete works of certain authors. What Klimek and co are looking at is the way this measure changes over time.

What’s more interesting is the possibility of grading wordquakes in real time using a magnitude system, just as seismologists do for earthquakes.

“One might also think of a ‘Richter scale’ for media events,” say Klimek and co. They say the largest event in their dataset is the nomination of Sarah Palin as vice presidential candidate.

This would be equivalent to the Big One hitting San Francisco or Tokyo. “Indeed, aftershocks of this event are still trembling and quivering through our society,” say Klimek and co.

Wordquakes given a magnitude might have exotic properties because of feedback effects. The very fact that a media event was labelled a Big One would generate interest that makes the quake even bigger.

This may or may not make wordquakes qualitatively different from earthquakes; we’ll have to see.

And a magnitude rating has further potential. Many human activities are known to follow earthquake-like power laws: epidemics, wars and fashions, to name just a few.

Using the blogosphere, and indeed the Twitterverse, to give a magnitude to these trends might turn out to be a useful and popular way of rating them.

An interesting project for an innovative web start-up.

Ref: arxiv.org/abs/1102.2091: The Blogosphere As An Excitable Social Medium: Richter’s And Omori’s Law In Media Coverage
