MIT Technology Review

A new AI creates original video clips from text cues

A short, typed description of a scene is enough to get this software making footage.

How it works: Science reports that the AI uses two neural networks: one generates video, and the other judges whether that video looks realistic, which in turn is used to improve the first network’s output. We named this kind of AI, known as a generative adversarial network (GAN), one of our 10 Breakthrough Technologies of 2018.
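
To make the two-network idea concrete, here is a minimal sketch of that adversarial setup in plain Python. It is not the researchers' system: it works on single numbers rather than video, and every name in it (`discriminator_step`, `generator_step`, `train_toy_gan`, the learning rate, the target mean of 4.0) is a hypothetical choice made for illustration. The "generator" is a line `a*z + b` fed with random noise, and the "discriminator" is a single sigmoid unit that scores how real a number looks.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def discriminator_step(w, c, real, fake, lr):
    # Discriminator D(x) = sigmoid(w*x + c).
    # Gradient ascent on log D(real) + log(1 - D(fake)).
    d_real = sigmoid(w * real + c)
    d_fake = sigmoid(w * fake + c)
    w += lr * ((1.0 - d_real) * real - d_fake * fake)
    c += lr * ((1.0 - d_real) - d_fake)
    return w, c


def generator_step(a, b, w, c, z, lr):
    # Generator G(z) = a*z + b.
    # Gradient ascent on log D(G(z)): nudge the fake sample toward
    # values the discriminator currently scores as more realistic.
    fake = a * z + b
    d_fake = sigmoid(w * fake + c)
    grad_fake = (1.0 - d_fake) * w  # d log D(fake) / d fake
    a += lr * grad_fake * z
    b += lr * grad_fake
    return a, b


def train_toy_gan(steps=2000, lr=0.01, real_mean=4.0, seed=0):
    # Assumed toy data: "real" samples are draws from a normal
    # distribution centred on real_mean (an illustrative choice).
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.1, 0.0  # discriminator parameters
    for _ in range(steps):
        real = rng.gauss(real_mean, 1.0)
        z = rng.gauss(0.0, 1.0)
        fake = a * z + b
        w, c = discriminator_step(w, c, real, fake, lr)
        a, b = generator_step(a, b, w, c, z, lr)
    return {"a": a, "b": b, "w": w, "c": c}


if __name__ == "__main__":
    print(train_toy_gan())
```

The two update functions pull in opposite directions: the discriminator is adjusted to tell real samples from generated ones, and the generator is adjusted to fool it. A toy this small may oscillate rather than settle, so it illustrates only the mechanism, not the quality of results a full video model could reach.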

What it does: First, the system is trained on footage of activities labelled with descriptions like “playing golf on grass.” It can then recreate similar scenes given a snippet of text. Plus, it can make clips combining disparate concepts from training data, such as “sailing on snow.”

Why it matters: Automatic generation of video from text could be incredibly useful, for example to create huge sets of synthetic training data for autonomous cars. It could also lead to some worrying fake content.

But: The clips are just 32 frames long and 64×64 pixels in size. They’re still not wholly convincing, and if they’re made larger, accuracy plummets. All that needs fixing to build a useful text-to-video converter.
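
For a sense of scale, the short calculation below counts how many numbers the generator has to produce for one clip at the stated size, and how that count grows if the frame is enlarged. The helper name `values_per_clip` is hypothetical, and the three color channels are an assumption (the article gives only frame count and resolution). It shows output size only; it does not explain why accuracy drops at larger sizes.

```python
def values_per_clip(frames, height, width, channels=3):
    # Total numbers in one clip. channels=3 assumes RGB color,
    # which the article does not state.
    return frames * height * width * channels


current = values_per_clip(32, 64, 64)    # the clip size reported above
larger = values_per_clip(32, 128, 128)   # hypothetical: each side doubled

print(current, larger, larger // current)
```

Doubling each side of the frame quadruples the number of values the generator must get right for every clip.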
