App Listens for Danger When You’re Not Paying Attention
An app called Audio Aware lets the hard of hearing and the distracted know when danger approaches.
As our devices become more engaging, the risks of being distracted increase.
A startup called One Llama is developing machine-learning technology that mimics the way the ear works, which it believes will make it easier for smartphones and wearable devices to constantly listen for sounds of danger.
One Llama will show some of its capabilities in an app called Audio Aware, which is meant to alert hard-of-hearing smartphone users and “distracted walkers” (an issue previously explored in “Safe Texting While Walking? Soon There May be an App for That”). The app, planned for release in March, will run in the background on an Android smartphone, detecting sounds like screeching tires and wailing sirens and alerting you to them by interrupting the music you’re listening to, for instance. The app will arrive with knowledge of a number of perilous sounds, and users will be able to add their own sounds to the app and share them with other people.
One Llama hopes Audio Aware will pique interest among makers of wearable gadgets, who could bake the technology into smart glasses, smart watches, and fitness trackers. In those devices, Audio Aware could do more than just be alert to dangers: it could monitor health conditions, workouts, or even locations by paying attention to the sounds you make and the noises around you. Bird watchers might want to use it to home in on the differences between, say, a male chipping sparrow and a dark-eyed junco.
The crux of One Llama’s technology is what the company calls its “artificial ear.” When sound enters your ear, it travels through the spiral-shaped cochlea, which is lined with tiny hair cells that vibrate like tuning forks when hit by certain frequencies. One Llama’s artificial ear is a software version of this—essentially, a bank of digital tuning forks that measure sounds. It’s based on work that cofounder David Tcheng and others conducted at the University of Illinois, where he is a research scientist.
The company claims this method can be speedier and more flexible than other common methods for analyzing the different frequencies of the vibrations that we hear as sounds.
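To make the "bank of digital tuning forks" idea concrete, here is a minimal sketch in Python. It is not One Llama's implementation, which the company has not published. It substitutes the well-known Goertzel algorithm as a stand-in for a single "fork" that responds to one frequency. The function names, the sample rate, and the choice of frequencies are all assumptions made for illustration.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Estimate signal power near `freq`: one illustrative 'tuning fork'."""
    n = len(samples)
    k = round(n * freq / sample_rate)      # nearest frequency bin for this block
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order resonator: it "rings" when the input contains `freq`.
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def fork_bank(samples, sample_rate, freqs):
    """Run every fork over the same audio block; return one energy per fork."""
    return [goertzel_power(samples, sample_rate, f) for f in freqs]

# Hypothetical input: 0.1 seconds of a pure 1,000 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]

freqs = [250, 500, 1000, 2000]             # assumed fork frequencies, in Hz
energies = fork_bank(tone, sample_rate, freqs)
loudest = freqs[energies.index(max(energies))]
```

In this toy setup only the 1,000 Hz fork responds strongly, so `loudest` picks out the tone's frequency. A real system would presumably use many more forks and process audio continuously.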
Audio Aware will listen through your smartphone's microphone, Tcheng says, constantly comparing what it hears to stored templates of the alert sounds it needs to recognize. When it detects a close enough match, such as a car horn, it will cancel any audio you're hearing and pipe in an amplified version of the sound it's picking up, or perhaps a cartoon-like version of that sound that is easier to recognize.
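The template comparison described above might look something like the following Python sketch. The article does not say how One Llama scores a match, so this assumes each sound is summarized as a short list of feature values and uses cosine similarity with a fixed threshold. The feature values, labels, and the 0.9 threshold are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two feature vectors, ignoring overall loudness."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def best_match(features, templates, threshold=0.9):
    """Return the label of the closest template, or None if nothing is close enough."""
    best_label, best_score = None, threshold
    for label, template in templates.items():
        score = cosine_similarity(features, template)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical stored templates: energy in four frequency bands per alert sound.
templates = {
    "car horn": [0.1, 0.9, 0.3, 0.0],
    "siren": [0.0, 0.2, 0.8, 0.6],
}

heard = [0.15, 0.85, 0.35, 0.05]       # resembles the car horn template
quiet_room = [0.5, 0.5, 0.5, 0.5]      # flat background noise, no close match

alert = best_match(heard, templates)           # a label means: interrupt the music
no_alert = best_match(quiet_room, templates)   # None means: keep playing
```

The threshold is the design choice that matters most here: set it too low and the app interrupts you for harmless noises, set it too high and it misses a real horn.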
Audio Aware will be able to work without access to a wireless network, but it will have to stream audio to a remote server when it learns new sounds—in a new country, for example, where sirens sound different than at home.
Can the app do all that it needs to do in time to warn you before you step in front of an oncoming car? Tcheng acknowledges the challenge but believes the software extracts audio features quickly enough to help users in real time. But One Llama's technology is not foolproof. Tcheng gave me a demonstration of how it could pick out several sounds (including breaking glass, a ringing doorbell, and a honking horn) over the din of a radio playing and a cat meowing in his home. Although the software correctly identified sounds such as glass breaking, it also incorrectly identified a doorbell ringing. Over time, presumably, the system would learn the difference.
Richard Stern, a professor at Carnegie Mellon University who researches speech recognition, says sound-processing methods based on the workings of the cochlea have become increasingly common in part because computer processing power has become so much cheaper over time.
Paying attention to how the auditory system processes signals can be helpful for recognizing sounds in noisy environments in particular, he says. But the complexity of sounds we encounter every day means sound-recognition systems are constantly trying to home in on one signal among many, and it’s virtually impossible to predict in advance how these signals will combine. Humans are still far ahead of computers in that respect.