Buzz Meter

Data mining sheds light on what makes news.

Political scientists have long studied the news cycle, tracking which people and topics drive coverage and for how long. But the sheer volume of news outlets has made it hard to quantify their results.

Researchers at Cornell University are trying to get a quantitative handle on how news stories proliferate. Computer scientist Jon Kleinberg reasoned that instead of trying to sort items from blogs and news sites into arbitrary categories, he could home in on quotes to identify their topics computationally. But references to a quote might extract different phrases from it, change its tense, or paraphrase it, resulting in dozens of different versions. So Kleinberg and his colleagues developed algorithms that determine family resemblances between strings of words in different articles.
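To give a rough sense of what grouping quote variants can look like, here is a minimal sketch. It is not the Cornell team's algorithm, which the article does not describe in detail. It assumes a simple stand-in technique: quotes are broken into overlapping two-word windows, and two quotes are treated as related when most of the shorter quote's windows also appear in the longer one. The function names, the threshold of 0.5, and the sample sentences are all invented for illustration.

```python
import re


def words(text):
    """Lowercase a quote and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def shingles(text, k=2):
    """Return the set of k-word windows found in a quote."""
    w = words(text)
    if len(w) < k:
        # Very short quotes become a single window.
        return {tuple(w)}
    return {tuple(w[i:i + k]) for i in range(len(w) - k + 1)}


def overlap(a, b):
    """Shared windows divided by the size of the smaller window set."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / min(len(sa), len(sb))


def group_variants(quotes, threshold=0.5):
    """Greedily place each quote in the first group it resembles."""
    groups = []
    for q in quotes:
        for g in groups:
            if any(overlap(q, member) >= threshold for member in g):
                g.append(q)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([q])
    return groups


# Hypothetical sample data, not drawn from the study.
full = "our economy is strong and the fundamentals are sound"
excerpt = "the fundamentals are sound"
unrelated = "we will keep the city safe tonight"

for group in group_variants([full, excerpt, unrelated]):
    print(group)
```

Dividing by the smaller set lets a short excerpt match the longer quote it was lifted from. A sketch this simple catches excerpts but not changes of tense or loose paraphrases, which is part of what makes the real problem hard.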

The researchers are now canvassing about a million online news items a day. Focusing on quotes might exclude some relevant items, but it helps identify the types of stories that prove most popular and the websites that report on them first. The researchers have found that with the exception of a handful of professional political blogs that are the fastest to sniff out a story, mainstream media sites drive coverage, converging on a story two and a half hours before blogs react. But mainstream sites are also quick to abandon stories, while blog interest can persist for days.

This story is part of our March/April 2009 Issue

Pop chart: This graph depicts the 50 phrases that generated the most buzz online in the last three months of the 2008 presidential campaign. The vertical axis indicates the number of Web items featuring some version of the phrase posted hourly; the horizontal axis shows fluctuation over time. Each phrase has an associated color, some labeled with phrase excerpts.

Credit: Jure Leskovec, Lars Backstrom, and Jon Kleinberg
