A new direction in machine learning is giving computers the ability to daydream, and the results are fascinating and potentially pretty useful.
A system developed by researchers at Nvidia in Santa Clara, California, can look at an image of a sunny road and imagine, in stunning detail, what it would look like if it were raining, nighttime, or a snowy day. It can also imagine what a house cat would look like if it were a leopard, a lion, or a tiger.
The software makes use of a popular new approach in AI that lets computers learn from data without explicit human labeling. The team used generative adversarial networks, or GANs, which are neural networks that work in tandem to learn the properties of a data set (see “Innovators Under 35: Ian Goodfellow”).
In a GAN, one neural network tries to produce synthetic data while another tries to tell if an example comes from the real data set or not. Feedback from the second network helps improve the performance of the first. The trick performed by the Nvidia team is to use two GANs trained on different but similar data, and to use similarities, or overlap, between the two trained models to dream up new imagery.
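The adversarial setup described above can be sketched in a few lines. The toy example below is illustrative only, not the Nvidia team's code: the "real data" is just scalars drawn from a Gaussian, the generator and discriminator are single linear units, and all hyperparameters are made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Toy "real" data set: scalars drawn from N(4, 1)
def real_batch(n):
    return rng.normal(4.0, 1.0, size=n)

# Generator g(z) = a*z + b tries to produce synthetic data;
# discriminator d(x) = sigmoid(w*x + c) tries to tell real from fake.
a, b = 1.0, 0.0          # generator parameters
w, c = 0.1, 0.0          # discriminator parameters
lr, batch = 0.05, 64

for step in range(500):
    z = rng.normal(size=batch)
    x_real = real_batch(batch)
    x_fake = a * z + b

    # Discriminator update: push d(real) toward 1 and d(fake) toward 0.
    s_real = sigmoid(w * x_real + c)
    s_fake = sigmoid(w * x_fake + c)
    dw = np.mean(-(1 - s_real) * x_real + s_fake * x_fake)
    dc = np.mean(-(1 - s_real) + s_fake)
    w -= lr * dw
    c -= lr * dc

    # Generator update: feedback from the discriminator (its gradient)
    # pushes the generator's output toward the real distribution.
    s_fake = sigmoid(w * x_fake + c)
    dx = -(1 - s_fake) * w          # gradient of -log d(fake) w.r.t. x_fake
    da = np.mean(dx * z)
    db = np.mean(dx)
    a -= lr * da
    b -= lr * db

# After training, generated samples should drift toward the real mean (~4).
fake_mean = np.mean(a * rng.normal(size=1000) + b)
```

The key point the sketch shows is the feedback loop: the generator never sees the real data directly, only the discriminator's gradient.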
In the case of street images, for instance, one GAN was trained to internalize the properties of roads while the other was trained using images of nighttime, rainy, or snowy scenes. Connecting the two networks lets a computer imagine what a scene would look like in different conditions. A similar trick was performed with house cats and big cats, and the researchers have released a video demonstrating both. They are presenting the work, described in an accompanying paper, at the Neural Information Processing Systems conference in Long Beach, California, this week.
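The "overlap between the two trained models" amounts to weight sharing: the two generators share their early layers, so one latent code describes the underlying scene and each domain-specific output layer renders it in its own style. The untrained linear sketch below illustrates only the wiring; the layer sizes and names are invented, and the real system learns these weights adversarially on images.

```python
import numpy as np

rng = np.random.default_rng(1)
dim_z, dim_h, dim_x = 8, 16, 4   # latent, shared-hidden, output sizes (illustrative)

# Shared early layer: captures content common to both domains.
W_shared = 0.5 * rng.normal(size=(dim_h, dim_z))

# Domain-specific output layers: render the same content in each "style".
W_day   = 0.5 * rng.normal(size=(dim_x, dim_h))
W_night = 0.5 * rng.normal(size=(dim_x, dim_h))

def generate(z, W_out):
    h = np.tanh(W_shared @ z)    # shared representation of the scene
    return W_out @ h             # domain-specific rendering

z = rng.normal(size=dim_z)       # one latent "scene"
day   = generate(z, W_day)
night = generate(z, W_night)
# Same z -> two corresponding renderings of the same underlying scene.
```

Because both outputs come from the same shared code, translating an image from one domain to the other reduces to recovering its code and decoding with the other domain's layers.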
“So far machine learning has been focused more on recognition,” says Ming-Yu Liu, who worked on the project with colleagues Thomas Breuel and Jan Kautz. “But humans can use their imagination. If I give you a photo in summertime, you can imagine what it will be like covered in snow.”
Liu says the technology could have practical applications in image and video editing, and for adding realistic effects to images and video posted to social networks. Imagine being able to post a live video showing you in a very realistic artificial setting, for example, or that convincingly converts your face into that of another person or an animal.
The approach could also prove useful for training self-driving systems to recognize more scenarios without having to collect a ridiculous amount of real-world data. “In California we don’t have a lot of snow, but we want our self-driving car to work well in the snow,” says Liu.