Inside the world of AI that forges beautiful art and terrifying deepfakes

Generative adversarial networks, or GANs, are fueling creativity—and controversy. Here’s how they work.

Karen Haoarchive page

December 1, 2018

Nvidia

In the last three weeks, we laid down the basics of AI. To recap:

Most AI advances and applications are based on a type of algorithm known as machine learning that finds and reapplies patterns in data.
Deep learning, a powerful subset of machine learning, uses neural networks to find and amplify even the smallest patterns.
Neural networks are layers of simple computational nodes that work together to analyze data, kind of like neurons in the human brain.

Now we get to the fun part. Using one neural network is really great for learning patterns; using two is really great for creating them. Welcome to the magical, terrifying world of generative adversarial networks, or GANs.

GANs are having a bit of a cultural moment. They are responsible for the first piece of AI-generated artwork sold at Christie’s, as well as the category of fake digital images known as “deepfakes.”

Their secret lies in the way two neural networks work together—or rather, against each other. You start by feeding both neural networks a whole lot of training data and give each one a separate task. The first network, known as the generator, must produce artificial outputs, like handwriting, videos, or voices, by looking at the training examples and trying to mimic them. The second, known as the discriminator, then determines whether the outputs are real by comparing each one with the same training examples.

Each time the discriminator successfully rejects the generator’s output, the generator goes back to try again. To borrow a metaphor from my colleague Martin Giles, the process “mimics the back-and-forth between a picture forger and an art detective who repeatedly try to outwit one another.” Eventually, the discriminator can’t tell the difference between the output and training examples. In other words, the mimicry is indistinguishable from reality.

You can see why a world with GANs is equal measures beautiful and ugly. On one hand, the ability to synthesize media and mimic other data patterns can be useful in photo editing, animation, and medicine (such as to improve the quality of medical images and to overcome the scarcity of patient data). It also brings us joyful creations like this:

#BigGAN is so much fun. I stumbled upon a (circular) direction in latent space that makes party parrots, as well as other party animals: pic.twitter.com/zU1mCh9UBe
— Phillip Isola (@phillip_isola) November 25, 2018

And this:

On the other hand, GANs can also be used in ethically objectionable and dangerous ways: to overlay celebrity faces on the bodies of porn stars, to make Barack Obama say whatever you want, or to forge someone’s fingerprint and other biometric data, an ability researchers at NYU and Michigan State recently showed in a paper.

Fortunately, GANs still have limitations that put some guard rails in place. They need quite a lot of computational power and narrowly scoped data to produce something truly believable. In order to produce a realistic image of a frog, for example, such a system needs hundreds of images of frogs from a particular species, preferably facing a similar direction. Without those specifications, you get some really wacky results, like this creature from your darkest nightmares:

ok these #BIGGAN results are incredible. #nature should take a hint. eyes distributed around the head is a winner #BIGGAN pic.twitter.com/hJBb3fUQ78
— memo akten (@memotv) September 30, 2018

(You should thank me for not showing you the spiders.)

But experts worry that we’ve only seen the tip of the iceberg. As the algorithms get more and more refined, glitchy videos and Picasso animals will become a thing of the past. As Hany Farid, a digital image forensics expert, once told me, we’re poorly prepared to solve this problem.

This originally appeared in our AI newsletter The Algorithm. To have it delivered directly to your inbox, subscribe here for free.

Deep Dive

Artificial intelligence

Large language models can do jaw-dropping things. But nobody knows exactly why.

And that's a problem. Figuring it out is one of the biggest scientific puzzles of our time and a crucial step towards controlling more powerful future models.

Will Douglas Heavenarchive page

Google DeepMind’s new generative model makes Super Mario–like games from scratch

Genie learns how to control games by watching hours and hours of video. It could help train next-gen robots too.

Will Douglas Heavenarchive page

What’s next for generative video

OpenAI's Sora has raised the bar for AI moviemaking. Here are four things to bear in mind as we wrap our heads around what's coming.

Will Douglas Heavenarchive page

The AI Act is done. Here’s what will (and won’t) change

The hard work starts now.

Melissa Heikkiläarchive page

Stay connected

Illustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

Inside the world of AI that forges beautiful art and terrifying deepfakes

Deep Dive

Artificial intelligence

Large language models can do jaw-dropping things. But nobody knows exactly why.

Google DeepMind’s new generative model makes Super Mario–like games from scratch

What’s next for generative video

The AI Act is done. Here’s what will (and won’t) change

Stay connected

Get the latest updates from
MIT Technology Review

The latest iteration of a legacy

Advertise with MIT Technology Review

About

Help

Deep Dive

Artificial intelligence

Large language models can do jaw-dropping things. But nobody knows exactly why.

Google DeepMind’s new generative model makes Super Mario–like games from scratch

What’s next for generative video

The AI Act is done. Here’s what will (and won’t) change

Stay connected

Get the latest updates fromMIT Technology Review

Get the latest updates from
MIT Technology Review