Hearing Machines

While hearing in machines lags far behind vision in machines, the potential is great, and researchers are beginning to make impressive progress.

Technology Review has invited members of the 2006 TR35 to tell us about their hopes for research in 2007. Paris Smaragdis explains the importance of improving machine hearing. Smaragdis, a 2006 TR35, is a research scientist at MERL Research Lab, in Cambridge, MA.

Understanding how we perceive the world, and using that knowledge to make machines that can mimic us, has been an ongoing and exciting scientific quest. Vision has had the lion’s share of attention in the field. Our understanding of image structure and form is well developed. The development of machine learning and artificial intelligence (AI) has immensely benefited from, and has been immensely influenced by, vision problems. And we all understand why computers and ATMs come equipped with cameras nowadays. The rest of the senses have not been investigated as much as vision has. Having a machine exhibit hearing is not something that people think about. Sure, computers can (sort of) recognize speech, but is that all hearing is good for? Surely we do more with our ears than just hear other people talk.

Our thinking is so concretely grounded in vision that hearing, as well as our other senses, has become a subconscious process. But hearing is important for a lot of tasks. You can hear your baby cry from upstairs; you can hear the car you didn’t see approaching you in the pedestrian crosswalk; and you can hear that not-so-friendly dog growling behind your back. Machines can do their own set of valuable hearing tasks. They can listen for survivors in a collapsed building’s rubble; they can help soldiers locate who shot at them; they can listen for breathing problems in patients in intensive care; and they can try to filter out that annoying neighbor who loves to sing really loudly in the shower.

The technical challenges in computational audition are plenty. As in all fields of computational perception, there is a thrillingly large number of problems awaiting exploration that will keep technologists busy for a long time. However, until the idea of a hearing machine captures the public imagination, these problems will stay at the fringe of computer science. And being at the fringe comes with an extra burden to those of us working in the field.

Our knowledge of human hearing is relatively limited. We know how our ears work, but we mostly improvise in our descriptions as neural signals move deeper into the brain. The study of auditory psychology is not even close to where we want it to be. Machine learning, AI, and classical computer-science algorithms are deeply rooted in a visual way of thinking that does not extend naturally to reasoning about sound. Our own ability to describe sounds and the process of hearing is predominantly limited to vocabulary developed for music. These problems share a common cause: hearing (whether human or machine) has not attracted adequate attention. Because of this, the process of creating a new technology in this field (from finding bibliographical references and abstracting to simpler problems, to actually explaining the point of it all in a business or technical meeting) is a fight against the unknown. Things are getting better, though: in the past few years an increasing number of researchers interested in computational perception have started showing interest in hearing (as well as in the senses of taste and olfaction), and we have seen some amazing progress in our field as well as the slow emergence of relevant products in the mainstream.

So keep your mind and ears open. You might not see much of hearing machines today, but you’ll be hearing about them soon.
