An AI Makes Some Sense of the World by Watching Videos Alone
DeepMind has developed software that forms links between activities and sounds in video through unsupervised learning. New Scientist reports that the firm's new AI uses three neural nets: one for image recognition, another for identifying sounds, and a third that ties results from the two together. But unlike many machine-learning algorithms, which are provided with labeled data sets to help them associate words with what they see or hear, this system was given a pile of raw data and left to fend for itself.
It was left alone with 60 million video stills, each paired with a one-second audio clip taken from the same point in the video at which the frame was captured. Without human assistance, the system slowly learned how sounds and image features were related, ultimately becoming able to link, say, crowds with a cheer and typing hands with that familiar clickety-clack. It can’t yet put a word to any of its observations, but it is another step toward AIs being able to make sense of the world without constantly being told what they see.
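To make the idea of learning without labels concrete, here is a minimal sketch of how training examples could be built from nothing but frame-and-audio pairs. It is an illustration, not DeepMind's code. The function name `make_correspondence_pairs`, the file names, and the use of deliberately mismatched pairs as negative examples are all assumptions that the report above does not spell out.

```python
import random


def make_correspondence_pairs(clips, seed=0):
    """Build training examples from (frame, audio) pairs with no human labels.

    `clips` is a list of (frame, audio) tuples, each taken from the same
    moment of a video. Returns a list of (frame, audio, label) tuples, where
    label 1 means the sound really accompanied the frame and label 0 means
    the sound was borrowed from a different clip.

    Assumption: mismatched pairs serve as negative examples. The article
    only says that matching pairs were provided.
    """
    if len(clips) < 2:
        raise ValueError("need at least two clips to form a mismatched pair")

    rng = random.Random(seed)  # seeded so the sketch is reproducible
    examples = []
    for i, (frame, audio) in enumerate(clips):
        # Positive example: frame and sound from the same moment.
        examples.append((frame, audio, 1))
        # Negative example: the same frame with sound from another clip.
        other = rng.choice([k for k in range(len(clips)) if k != i])
        examples.append((frame, clips[other][1], 0))
    return examples


# Hypothetical file names, used only for illustration.
clips = [
    ("crowd.jpg", "cheer.wav"),
    ("typing_hands.jpg", "clickety_clack.wav"),
    ("dog.jpg", "bark.wav"),
]
examples = make_correspondence_pairs(clips)
```

In a setup like this, the image network and the sound network would each turn their input into features, and the third network would be trained to predict the 0-or-1 label from the two sets of features. The labels come for free from how the video was recorded, which is why no person has to annotate anything.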