After several decades in the doldrums, AI is experiencing quite a renaissance. In recent years, amazing progress has been made using so-called deep learning, which trains algorithms on large amounts of data so that they can recognize subtle patterns. Such approaches have enabled computers to recognize faces in an image or transcribe speech, often with eerily human accuracy.
It’s becoming clear, however, that fundamentally new approaches will be needed if machines are to demonstrate more meaningful intelligence. One technique, being applied by a Silicon Valley startup called MetaMind, shows how adding novel memory capabilities to deep learning can produce impressive results when it comes to answering questions about the content of images. The company was founded by Richard Socher, a machine-learning expert who left an academic post at Stanford to start it.
Socher’s creation uses what the company calls a dynamic memory network (DMN) to enable computers to infer useful things from various inputs. Such a network lets a deep-learning system store and update facts as it parses more information. Previously, the company showed how its system can feed on different sentences and figure out how to answer some fairly sophisticated questions that require inference. This ability has now been applied to answering questions about the contents of images.