A View from Emerging Technology from the arXiv
The Evolution of Automated Breaking News Stories
A Google engineer has developed an algorithm that spots breaking news stories on the Web and illustrates them with pictures. And it is now filing its first stories on Twitter.
Breaking news stories are one of the driving forces for online media. So the ability to automatically spot interesting or important new events that are happening now is hugely valuable.
Last year, Thomas Steiner at Google Germany, in Hamburg, released just such an algorithm, one that spots breaking news events as they happen. Today, he’s updated it with a picture-based interface that attempts to tell the stories behind the news events that the algorithm has spotted.
The process of automatically spotting breaking news events is relatively straightforward. It is based on the idea that if something important is happening now, Wikipedia editors working in different languages will update the relevant pages at the same time.
Wikipedia and its sister site Wikidata publish all edits using the Wikimedia IRC server. This allows all interested parties to monitor edits as they happen. Steiner’s algorithm simply monitors this feed looking for the simultaneous activity that is the signature of breaking news.
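The detection idea described above can be sketched in a few lines of Python. This is only an illustration, not Steiner’s code: the class name, the five-minute window and the three-language threshold are assumptions made for the example, and the sketch presumes that each edit has already been mapped to a language-independent topic key and that edits arrive in time order.

```python
from collections import defaultdict, deque

# Both values are assumptions for illustration; the article gives no thresholds.
WINDOW_SECONDS = 300
MIN_LANGUAGES = 3


class BreakingNewsDetector:
    """Flags a topic when enough language editions edit it within one time window."""

    def __init__(self, window=WINDOW_SECONDS, min_languages=MIN_LANGUAGES):
        self.window = window
        self.min_languages = min_languages
        # topic -> deque of (timestamp, language) for edits still inside the window
        self.recent = defaultdict(deque)

    def observe(self, timestamp, language, topic):
        """Record one edit; return True if the topic now looks like breaking news.

        Assumes timestamps (in seconds) arrive in non-decreasing order.
        """
        edits = self.recent[topic]
        edits.append((timestamp, language))
        # Drop edits that have fallen out of the sliding window.
        while edits and edits[0][0] < timestamp - self.window:
            edits.popleft()
        languages = {lang for _, lang in edits}
        return len(languages) >= self.min_languages
```

Fed a stream of edits, `observe` returns True only once edits to the same topic from three different language editions fall within the same five-minute window.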
He called this application the Wikipedia Live Monitor and released it last year. It has since identified numerous breaking news stories, such as the Boston Marathon bombings and the more recent loss of Malaysia Airlines flight MH370.
Now Steiner has added a visual element to this process. This is based on another application that he and others have developed that searches social media for images associated with a particular search term. It then extracts any visual media, removes duplicates and then crops the images so that they fit together in a grid. He calls this application Social Media Illustrator.
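The deduplication and cropping steps can be sketched as follows. This is a simplified illustration, not Social Media Illustrator itself: both function names are hypothetical, duplicates are detected only by exact byte-for-byte match (a real system would probably also need to catch near-duplicates), and the crop is a plain center crop to the aspect ratio of a grid cell.

```python
import hashlib


def remove_exact_duplicates(images):
    """Keep the first copy of each image, comparing raw bytes by SHA-256 digest.

    Assumption: only byte-identical files count as duplicates here.
    """
    seen = set()
    unique = []
    for data in images:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(data)
    return unique


def center_crop_box(width, height, cell_width, cell_height):
    """Return (left, top, right, bottom) of the largest centered region
    that has the same aspect ratio as a grid cell."""
    # Compare aspect ratios by cross-multiplying to stay in integer arithmetic.
    if width * cell_height > height * cell_width:
        # Image is wider than the cell: trim the left and right edges.
        new_width = height * cell_width // cell_height
        left = (width - new_width) // 2
        return (left, 0, left + new_width, height)
    # Image is taller than (or matches) the cell: trim the top and bottom edges.
    new_height = width * cell_height // cell_width
    top = (height - new_height) // 2
    return (0, top, width, top + new_height)
```

For a 400 x 200 image and a square cell, `center_crop_box(400, 200, 100, 100)` gives `(100, 0, 300, 200)`, so 100 pixels are discarded from each side. Anything near the edge of the original image is lost in a crop of this kind.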
So Steiner’s new service uses the output from the Wikipedia Live Monitor as the input search term for Social Media Illustrator. The result is a set of images associated with the breaking news event, organized in a grid. Steiner’s assumption is that these images somehow tell the story behind the news event.
He publishes these images on Twitter at https://twitter.com/mediagalleries.
In today’s paper, Steiner presents the results of a test of the new system carried out during the 2014 Winter Olympics. The idea is that when a particular event ends, the winners are recorded on Wikipedia simultaneously in several very different languages. This triggers the breaking news algorithm to send the names of those winners to Social Media Illustrator, which creates a grid of images associated with these individuals.
The results are curious. Steiner has asked a number of independent viewers to rank the relevance of the pictures that are generated in this process and they agree that most of the time, the images are relevant.
Whether they tell the story behind the news event is another question. On this evidence, the answer is sometimes, but often not.
A quick glance at the Twitter feed reveals where more work is needed. One problem is that in many cases, it is not at all clear which breaking news story the images refer to. Nor are the images in the media gallery hyperlinked, so it’s not possible to click through and see where they came from. What’s more, the images must be cropped to fit together in a grid, and this often removes important information, such as captions.
That’s not to say that the approach doesn’t have potential. There is growing interest in the automated production of news, and algorithms now exist that can do this in some circumstances with varying degrees of success. It’s quite possible that some of the news we consume in the future will be spotted, evaluated, written and illustrated by an algorithm.
So far, these algorithms are relatively crude and human journalists generally do a significantly better job. So in the short term at least, human journalists seem safe.
Whether they can be taken out of the loop entirely is a bigger question. On this evidence, not in the short term.
Ref: arxiv.org/abs/1403.4289: Telling Breaking News Stories from Wikipedia with Social Multimedia: A Case Study of the 2014 Winter Olympics