Nvidia’s Tag-Teaming AIs Imagine Night as Day, and House Cats as Tigers

Will Knight

December 4, 2017

A new direction in machine learning is giving computers the ability to daydream, and the results are fascinating and potentially pretty useful.

A system developed by researchers at Nvidia in Santa Clara, California, can look at an image of a sunny road and imagine, in stunning detail, what it would look like if it were raining, nighttime, or a snowy day. It can also imagine what a house cat would look like if it were a leopard, a lion, or a tiger.

The software makes use of a popular new approach in AI that lets computers learn from unlabeled data, without a human annotating the examples. The team used generative adversarial networks, or GANs, which are pairs of neural networks that work in tandem to learn the properties of a data set (see “Innovators Under 35: Ian Goodfellow”).

In a GAN, one neural network tries to produce synthetic data while another tries to tell if an example comes from the real data set or not. Feedback from the second network helps improve the performance of the first. The trick performed by the Nvidia team is to use two GANs trained on different but similar data, and to use similarities, or overlap, between the two trained models to dream up new imagery.
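
To make the back-and-forth between the two networks concrete, here is a toy sketch in plain Python. It is not Nvidia's system: the one-number "images," the linear generator, the logistic discriminator, and every name in it (Generator, Discriminator, train_step, train) are illustrative assumptions, chosen only to show how feedback from the second network nudges the first.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


class Generator:
    """Hypothetical toy generator: turns noise z into a sample x = scale * z + shift."""

    def __init__(self):
        self.scale, self.shift = 1.0, 0.0

    def sample(self, z):
        return self.scale * z + self.shift


class Discriminator:
    """Hypothetical toy discriminator: estimates the probability that x is real."""

    def __init__(self):
        self.w, self.b = 0.0, 0.0

    def prob_real(self, x):
        return sigmoid(self.w * x + self.b)


def train_step(gen, disc, real_x, z, lr=0.05):
    fake_x = gen.sample(z)

    # Discriminator update: gradient ascent on log D(real) + log(1 - D(fake)).
    p_real = disc.prob_real(real_x)
    p_fake = disc.prob_real(fake_x)
    disc.w += lr * ((1 - p_real) * real_x - p_fake * fake_x)
    disc.b += lr * ((1 - p_real) - p_fake)

    # Generator update: gradient ascent on log D(fake), using the
    # discriminator's verdict as the only feedback signal.
    p_fake = disc.prob_real(fake_x)
    grad_x = (1 - p_fake) * disc.w  # d log D(x) / dx at x = fake_x
    gen.scale += lr * grad_x * z
    gen.shift += lr * grad_x


def train(steps=200, seed=0):
    # Assumed stand-in for a real data set: numbers drawn around 4.0.
    rng = random.Random(seed)
    gen, disc = Generator(), Discriminator()
    for _ in range(steps):
        real_x = rng.gauss(4.0, 0.5)
        z = rng.gauss(0.0, 1.0)
        train_step(gen, disc, real_x, z)
    return gen, disc
```

In a single step with real_x=4.0 and z=0.0, the discriminator's weight rises from 0 to 0.1, and the generator's shift moves slightly upward, toward the real sample. Real GANs replace these two-parameter models with deep networks and train on batches of images, and training is often far less well behaved than this toy suggests.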

In the case of street images, for instance, one GAN was trained to internalize the properties of roads while the other was trained using images of nighttime, rainy, or snowy scenes. Connecting the two networks lets a computer imagine what a scene would look like in different conditions. A similar trick was performed with house cats and big cats, and a video shows the full results. The researchers are presenting the work at the Neural Information Processing Systems conference in Long Beach, California, this week. A paper describing the work is available as a PDF.

“So far machine learning has been focused more on recognition,” says Ming-Yu Liu, who worked on the project with colleagues Thomas Breuel and Jan Kautz. “But humans can use their imagination. If I give you a photo in summertime, you can imagine what it will be like covered in snow.”

Liu says the technology could have practical applications in image and video editing, and for adding realistic effects to images and video posted to social networks. Imagine being able to post a live video showing you in a very realistic artificial setting, for example, or that convincingly converts your face into that of another person or an animal.

The approach could also prove useful for training self-driving systems to recognize more scenarios without having to collect a ridiculous amount of real-world data. “In California we don’t have a lot of snow, but we want our self-driving car to work well in the snow,” says Liu.
