A new direction in machine learning is giving computers the ability to daydream, and the results are fascinating and potentially pretty useful.
A system developed by researchers at Nvidia in Santa Clara, California, can look at an image of a sunny road and imagine, in stunning detail, what it would look like if it were raining, nighttime, or a snowy day. It can also imagine what a house cat would look like if it were a leopard, a lion, or a tiger.
The software makes use of a popular new approach in AI that lets computers learn without human help. The team used generative adversarial networks, or GANs, which are neural networks that work in tandem to learn the properties of a data set (see “Innovators Under 35: Ian Goodfellow”).
In a GAN, one neural network tries to produce synthetic data while another tries to tell if an example comes from the real data set or not. Feedback from the second network helps improve the performance of the first. The trick performed by the Nvidia team is to use two GANs trained on different but similar data, and to use similarities, or overlap, between the two trained models to dream up new imagery.
In the case of street images, for instance, one GAN was trained to internalize the properties of roads while the other was trained using images of nighttime, rainy, or snowy scenes. Connecting the two networks lets a computer imagine what a scene would look like in different conditions. A similar trick was performed with house cats and big cats (you can check out the full video here). The researchers are presenting the work at the Neural Information Processing Systems conference in Long Beach, California, this week. Here’s a paper (PDF) that describes the work.
“So far machine learning has been focused more on recognition,” says Ming-Yu Liu, who worked on the project with colleagues Thomas Breuel and Jan Kautz. “But humans can use their imagination. If I give you a photo in summertime, you can imagine what it will be like covered in snow.”
Liu says the technology could have practical applications in image and video editing, and for adding realistic effects to images and video posted to social networks. Imagine being able to post a live video showing you in a very realistic artificial setting, for example, or that convincingly converts your face into that of another person or an animal.
The approach could also prove useful for training self-driving systems to recognize more scenarios without having to collect a ridiculous amount of real-world data. “In California we don’t have a lot of snow, but we want our self-driving car to work well in the snow,” says Liu.
Geoffrey Hinton tells us why he’s now scared of the tech he helped build
“I have suddenly switched my views on whether these things are going to be more intelligent than us.”
Deep learning pioneer Geoffrey Hinton has quit Google
Hinton will be speaking at EmTech Digital on Wednesday.
The future of generative AI is niche, not generalized
ChatGPT has sparked speculation about artificial general intelligence. But the next real phase of AI will be in specific domains and contexts.
Video: Geoffrey Hinton talks about the “existential threat” of AI
Watch Hinton speak with Will Douglas Heaven, MIT Technology Review’s senior editor for AI, at EmTech Digital.
Get the latest updates from
MIT Technology Review
Discover special offers, top stories, upcoming events, and more.