MIT Technology Review

 


One of the interesting features of modern news coverage is the way that stories spread across the web. Many research teams have attempted to model this process, likening it, among other things, to the spread of flu, fashions and forest fires.

One of the fundamental insights that these studies have produced is why these phenomena spread in similar ways. They do not share similar physical properties: a flu virus is not much like a burning leaf or a designer dress.

What these things have in common is the networks on which they spread. It is their environment and the way it is linked together that determines how these events cascade. 

The crucial point is that the network of links between trees is similar to the network of contacts between people and the network of links between websites. Because of this, the properties of one network can reasonably be assumed to apply to the others.

Today, Felix Biessmann at the Berlin Institute of Technology in Germany and a few pals study the problem of trend setting among news sites. The question they attempt to answer is which websites lead the news coverage and which ones merely follow.

Their approach is essentially to take a snapshot of the words generated by a group of websites at any instant in time and compare it to the words generated by one of these websites at an earlier time. 

This allows them to calculate whether the content of this single website is a good predictor of future content on other websites. In other words, whether it is a trend setter. They then rank the websites according to this metric.
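The idea of scoring a site by how well its earlier words predict everyone else's later words can be sketched in a few lines of Python. This is a deliberately simplified illustration and should not be read as the authors' actual method: it swaps in a plain lagged cosine similarity between word counts. The function names, the `snapshots` data layout, the `lag` parameter and the tiny stop-word list are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt

# Illustrative stop-word list only; a real study would use a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on"}


def bag_of_words(text):
    """Count the words in a text, dropping common words."""
    return Counter(w for w in text.lower().split() if w not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def trendsetter_score(site, snapshots, lag=1):
    """Average similarity between `site` at time t - lag and all other sites at time t.

    `snapshots` is a hypothetical layout: a list of dicts, one per time step,
    mapping a site name to the text it published in that step.
    """
    scores = []
    for t in range(lag, len(snapshots)):
        earlier = bag_of_words(snapshots[t - lag].get(site, ""))
        others = Counter()
        for name, text in snapshots[t].items():
            if name != site:
                others += bag_of_words(text)
        scores.append(cosine(earlier, others))
    return sum(scores) / len(scores) if scores else 0.0


def rank_sites(snapshots, lag=1):
    """Rank sites from strongest to weakest predictor of later content elsewhere."""
    sites = sorted({name for snap in snapshots for name in snap})
    return sorted(sites, key=lambda s: trendsetter_score(s, snapshots, lag), reverse=True)
```

Given a list of snapshots in which one site's vocabulary reappears on the other sites one step later, `rank_sites` places that site first. A score like this shares the weakness of any purely predictive measure: a site that is merely quick to repost material from elsewhere will also score highly.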

The results are unsurprising. The team monitored 96 technology news websites throughout 2011, a process that generated data on some 100,000 words (after common words had been removed).

This is their list of the top trendsetters in technology news coverage:

businessinsider
arstechnica
engadget
techcrunch
mashable
venturebeat
techdirt
theregister
forbes
guardian

That’s clearly a list of the biggest and most popular technology news sites on the web. 

One problem with this approach is that it fails to differentiate between news generated by current events, such as an earthquake, a product launch or the death of a well-known figure like Steve Jobs, and news generated by old-fashioned journalistic legwork, like an investigation into child labour abuses or financial irregularities.

The difference is that a big earthquake would get significant media coverage whether or not a particular website covered it, whereas a journalistic exposé only gets coverage because of the legwork performed by a particular website.

This is related to another problem. The real trend setters may lie outside the group of 96 websites that these guys have monitored. For example, wire services such as the Associated Press and Reuters have a huge impact on the spread of news, and many of the bigger websites will have subscriptions to these services.

In this case, the apparent trend setters are simply the sites that post the wire stories first or that post so many of them that they are first often enough to seem like trend setters.

Clearly there’s more work to be done in identifying trend setters. What’s interesting, however, is that the techniques these researchers and others have developed will have wider application. Trend setters in technology news play a similar role to the first victims in an epidemic who spread the disease, the match or lightning strike that triggers a forest fire and the fashionistas who set clothing trends.

Much work has gone into identifying these too. Perhaps a similar process might help identify the websites that act as ‘matches’ on the web, those sites that trigger each new wave of news.

Ref: arxiv.org/abs/1206.6388: Canonical Trends: Detecting Trend Setters in Web Data
