After several decades in the doldrums, AI is experiencing quite a renaissance. In recent years, amazing progress has been made using so-called deep learning, which involves training algorithms with large amounts of data so that they can recognize subtle patterns. Such approaches have enabled computers to recognize faces in images or transcribe speech, often with eerily human accuracy.
It’s becoming clear, however, that fundamentally new approaches will be needed if machines are to go on to demonstrate more meaningful intelligence. One technique, being applied by a Silicon Valley startup called MetaMind, shows how adding novel memory capabilities to deep learning can produce impressive results when it comes to answering questions about the content of images. The company was founded by Richard Socher, a machine-learning expert who left an academic post at Stanford to start it.
MetaMind uses what it calls a dynamic memory network (DMN) to enable computers to infer useful things from various inputs. Such a network lets a deep-learning system store and update facts as it parses more information. Previously the company showed how its system can take in a series of sentences and figure out how to answer some fairly sophisticated questions that require inference. This ability has now been applied to answering questions about the contents of images.
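To make the idea of storing and updating facts a little more concrete, here is a toy sketch in Python. It is not MetaMind's system: a real dynamic memory network works on learned vector representations with neural attention, whereas this sketch substitutes plain word overlap. The names `tokens`, `attend`, and `STOPWORDS`, along with the example sentences, are made up for illustration. The point is only that a memory revised after each pass over the facts can pull in a second fact that the question alone would never match.

```python
# Toy illustration only: word overlap stands in for learned neural attention.
STOPWORDS = {"the", "a", "to", "is", "up", "where", "in"}


def tokens(sentence):
    """Lowercase a sentence and return its set of non-stopword words."""
    words = sentence.lower().rstrip("?.").split()
    return {w for w in words if w not in STOPWORDS}


def attend(facts, question, passes=2):
    """Return the facts picked up over several passes, in order.

    The memory starts as the question's words. On each pass, the fact
    sharing the most words with the memory is selected and its words
    are folded into the memory, so later passes can follow the chain.
    """
    memory = tokens(question)
    attended = []
    for _ in range(passes):
        candidates = [f for f in facts if f not in attended]
        if not candidates:
            break
        best = max(candidates, key=lambda f: len(tokens(f) & memory))
        if not tokens(best) & memory:
            break  # nothing left that relates to what is in memory
        attended.append(best)
        memory |= tokens(best)  # update the memory with the new fact
    return attended


facts = [
    "John picked up the football.",
    "Mary went to the kitchen.",
    "John went to the garden.",
]
question = "Where is the football?"

for fact in attend(facts, question):
    print(fact)
```

With a single pass, only the sentence mentioning the football is found. The second pass succeeds because the memory now contains "john", which links the football to the garden.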
As a piece on MetaMind in the New York Times explains, the results are quite basic, and nowhere near as sophisticated as a human’s ability to understand what’s going on in images. But the technology shows how new approaches, especially ones that take inspiration from the way memory seems to work in biological brains, may hold the key to the next big step forward in AI.