Twitter’s Artificial Intelligence Knows What’s Happening in Live Video Clips

Twitter has been developing technology that automatically recognizes what’s happening in live video, a step toward sophisticated recommendations.

Will Knight

April 28, 2016

Right now, someone somewhere is live-streaming something interesting. Thanks to technology being developed by a team of artificial intelligence researchers at Twitter, you may soon be able to find it.

Live-streaming is becoming ever more popular through smartphone apps such as Periscope from Twitter, Meerkat, and, most recently, Facebook Live. But live video content usually isn’t tagged or categorized well, often because people don’t know what they’ll record until the camera begins rolling.

Twitter’s AI team, known as Cortex, has developed an algorithm that can instantly recognize what’s happening in a live feed. The algorithm can tell, for instance, if the star of a clip is playing guitar, demoing a power tool, or is actually a cat hamming it up for viewers.

“Content is always changing on Periscope, and more generally on live videos,” says Clement Farabet, who is the technology lead for Cortex. Farabet demonstrated the video-recognition technology to MIT Technology Review, showing a screen of about two dozen Periscope feeds, all being tagged in real time.

Identifying the content of live video is a pretty impressive trick. Researchers have made significant progress in recent years with algorithms that can identify objects in photographs, but doing the same with live video of varying quality is much more difficult. Doing it instantly also requires considerable computing power. Twitter effectively built a custom supercomputer made entirely of graphics processing units (GPUs) to perform the video classification and serve up the results. These chips are especially efficient at the mathematical calculations required for deep learning, but normally they are just one part of a larger computer system.

“It is quite a challenge even for static videos, and for runtime videos they must have a lot of processing power,” says Peter Brusilovsky, a professor at the University of Pittsburgh who studies the personalization of content.

Brusilovsky says that better ways of filtering video are badly needed. “Videos generally aren’t skimmable,” he says. “As a result, recommendation is very important. It’s kind of the missing piece of video.”

Recommending videos usually involves showing a person clips that have been watched by someone else who seems to have similar taste (an approach known as “collaborative filtering”). This is a crude gauge of real interest, though, and it does not work for content that is being broadcast live.
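To make the limitation concrete, here is a minimal sketch of user-based collaborative filtering in Python. It is only an illustration, not Twitter's or Periscope's system: the viewer names, clip names, and the choice of Jaccard similarity are all assumptions made for the example.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap between two sets of watched clips, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(target, histories, top_n=3):
    """Rank clips the target has not seen by the similarity of their viewers."""
    seen = histories[target]
    scores = defaultdict(float)
    for user, clips in histories.items():
        if user == target:
            continue
        weight = jaccard(seen, clips)
        if weight == 0.0:
            continue  # viewers with no shared history contribute nothing
        for clip in clips - seen:
            scores[clip] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [clip for clip, _ in ranked[:top_n]]


# Hypothetical watch histories, invented for this example.
histories = {
    "ana": {"guitar_lesson", "cat_tricks"},
    "ben": {"guitar_lesson", "cat_tricks", "drum_solo"},
    "cy": {"power_drill_demo"},
}

print(recommend("ana", histories))
```

In this sketch a clip can only be recommended once someone with overlapping history has already watched it. A stream that started seconds ago has no viewers yet, so it never receives a score, which is why the approach breaks down for live broadcasts.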

The Cortex team has ambitions to develop a sophisticated recommendation system to help filter and curate all sorts of content shared through the service, based on a user’s previous activity.

The video-recognition technology developed by the Cortex team hasn’t yet made it into any of Twitter’s products, but it’s being tested on Periscope, an app owned by Twitter that lets users transmit live video from their smartphones. The team is using an approach known as deep learning to recognize the activity in clips. Deep learning involves training a large simulated neural network to recognize inputs from a large number of examples. The examples are provided by staff paid to watch videos and add keywords. This tagging process provides a fairly complex semantic understanding of video clips. For example, a video showing a cat may be categorized not just with “cat” but also “animal,” “feline,” “mammal,” and more. This offers a more sophisticated way to explore clips.
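One way to picture the layered tagging described above is as a walk up a small taxonomy, sketched below in Python. The taxonomy, the clip names, and the functions are hypothetical and exist only for illustration; the article does not say how Cortex actually stores or expands its keywords.

```python
# Hypothetical taxonomy: each tag points to its broader parent tag.
PARENT = {
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "guitar": "instrument",
}


def expand_tags(label, parent=PARENT):
    """Return the label followed by every broader tag above it."""
    tags = [label]
    seen = {label}
    while tags[-1] in parent:
        broader = parent[tags[-1]]
        if broader in seen:  # guard against an accidental cycle
            break
        tags.append(broader)
        seen.add(broader)
    return tags


def find_clips(query, clip_labels, parent=PARENT):
    """Return clips whose expanded tags include the query, sorted by name."""
    return sorted(
        clip for clip, label in clip_labels.items()
        if query in expand_tags(label, parent)
    )


# Hypothetical clips, each with the single label a classifier assigned.
clip_labels = {"clip_a": "cat", "clip_b": "guitar", "clip_c": "feline"}

print(expand_tags("cat"))
print(find_clips("mammal", clip_labels))
```

With tags expanded this way, a search for a broad term such as “mammal” still surfaces a clip that was labeled only “cat.”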

Live video is rapidly becoming an important part of the social media landscape.

Twitter acquired Periscope in January 2015, before the app had even launched, for a sum reportedly in excess of $50 million. This followed the breakout success of Meerkat, another app tied to Twitter. Facebook launched its own live video service in 2015, and the company increased the feature’s prominence earlier this month by adding it to the homepage each user sees.

There are no plans as yet to monetize the technology, and Periscope does not currently feature advertising. But it isn’t hard to imagine how such a tool could be useful for advertising, by algorithmically matching ads to the contents of videos as they are filmed and broadcast. As more and more video moves online, in fact, the algorithm could help Twitter tailor ads to such content a lot more efficiently. Notably, this month the company won the right to broadcast certain NFL footage live.

Ben Edelman, an associate professor at Harvard’s Berkman Center and an expert on online media and advertising, says the technique developed by Twitter could prove important for filtering out copyrighted content as well as undesirable content such as pornography or violence.

But Farabet is just as interested in finding stuff people do want to see. “Having an ability to truly understand what you’re interested in—completely independent of who produced it or when it was produced—is a fundamental capability that we really want to have,” he says.
