
Audio Menus for iPods

Researchers are testing ways to let people listen to gadget menu options instead of looking at them.

Clicking through the menu on your iPod demands a significant amount of visual attention, which can be a hassle (while jogging) and even dangerous (while driving). But engineers at the University of Toronto and Microsoft Research are working on software that could make it possible to navigate the menus of gadgets that use circular touch pads, like the iPod, without looking at them, using only audio cues.

Hear this: Above are two potential menu displays for an auditory menu on a gadget, such as Apple’s iPod, that uses a circular track pad. When a person’s finger is in a particular position on the track pad, a recording of a human voice states the menu item. An item is selected when a person removes his or her finger.

The researchers have designed an auditory menu technique, called earPod, that provides audio feedback when a person drags his or her finger around the touch pad. Although it’s not ready to replace the expansive menus on real iPods, the results are encouraging, says Patrick Baudisch, a research scientist at Microsoft Research, in Seattle, who worked on the project. Within 30 minutes of beginning to use the technology, people can navigate two levels of earPod menus faster than they can navigate traditional visual menus, and just as accurately.

“Requiring constant visual attention while using a PC is reasonable,” says Baudisch, “but if you’re using an iPod on the road, [constant visual attention] is unreasonable.” In addition to giving people back their eyes, he says, audio menus could help gadgets save battery life by not wasting energy on a screen, and they could add functions to screen-free devices such as the iPod shuffle.

The idea of using audio menus isn’t new. Auditory interfaces can, after all, be found in touch-tone phone menus and in various assistive technologies for visually impaired users. But historically, handheld consumer gadgets haven’t widely used audio menus. There are a few reasons for this, says Bruce Walker, a professor in the school of psychology and the college of computing at the Georgia Institute of Technology. One reason, he says, is that audio hardware and software have been resource intensive, requiring significant amounts of computation and energy. In addition, audio software has been difficult to program.

But computing power is becoming cheaper, and there is an increasing need to find different ways to interact with handheld devices, says Walker. Within the past 10 years, he says, the ubiquity of mobile devices with small displays “has made us all visually impaired.” Currently there are only a handful of researchers who are systematically looking at ways to make better audio interfaces for various devices, but Walker expects the ranks to grow in the coming years.

This first earPod prototype has a two-level menu hierarchy: eight categories with eight items each, for a total of 64 items. To test how well people use the system, the researchers assigned to the first menu level a random assortment of categories: “clothing,” “fish,” “instrument,” “color,” and four others. The next level contained eight examples of each category. On an iPod, the analogous structure is the opening menu, which includes “music,” “extras,” and “settings,” followed by lower menus that include “playlists,” “artists,” and “albums,” for instance. The earPod approach could be extended to read off a limited number of names of artists and songs as well.

EarPod was designed specifically for gadgets with circular touch pads, says Baudisch. The circular touch pad is evenly divided into eight sectors: it’s cut like pieces of a pie, with one menu item assigned to each piece. When a person touches the dial of an earPod-equipped gadget, the audio menu responds with a prerecorded human voice. If a person puts his or her finger at 12 o’clock on the touch pad, the voice might say “Color,” indicating that the finger is on the color sector. When the finger crosses one of the invisible lines between sectors, the user hears a clicking sound, and the menu item for the new sector is announced. To select an item and go to the next menu level, the user lifts his or her finger and hears a “camera-shutter” sound, which indicates that an item has been chosen.
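The interaction described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' implementation: the coordinate convention, the choice to center sector 0 on 12 o'clock with indices increasing clockwise, the class and method names, and the placeholder item labels are all assumptions. Audio output is replaced by a list of event strings.

```python
import math

SECTORS = 8  # from the article: the pad is divided into eight equal sectors


def sector_at(x: float, y: float, sectors: int = SECTORS) -> int:
    """Map a touch point to a sector index.

    Assumptions (not from the article): the pad center is (0, 0), y points
    up, sector 0 is centered on 12 o'clock, and indices increase clockwise.
    """
    # atan2(x, y) gives the angle measured clockwise from 12 o'clock.
    angle = math.degrees(math.atan2(x, y)) % 360.0
    width = 360.0 / sectors
    # Shift by half a sector so 12 o'clock sits in the middle of sector 0.
    return int(((angle + width / 2) % 360.0) // width)


class EarPodMenu:
    """Hypothetical one-level menu; events stand in for audio playback."""

    def __init__(self, items):
        self.items = list(items)
        self.current = None  # sector under the finger, or None if lifted
        self.events = []     # e.g. "say:color", "click", "shutter"

    def touch(self, x: float, y: float) -> None:
        sector = sector_at(x, y, len(self.items))
        if sector == self.current:
            return  # still inside the same sector: nothing new to announce
        if self.current is not None:
            self.events.append("click")  # finger crossed a sector boundary
        self.current = sector
        self.events.append(f"say:{self.items[sector]}")

    def lift(self):
        """Lifting the finger selects the item under it."""
        if self.current is None:
            return None
        selected = self.items[self.current]
        self.events.append("shutter")  # confirmation sound
        self.current = None
        return selected


# Placeholder labels: only "color" at 12 o'clock comes from the article.
menu = EarPodMenu(["color"] + [f"item {i}" for i in range(2, 9)])
menu.touch(0.0, 1.0)   # finger lands at 12 o'clock
menu.touch(1.0, 0.0)   # finger slides to 3 o'clock
choice = menu.lift()   # finger lifts: the item under it is selected
```

A real device would also need a dead zone near the pad center, where the angle is unstable, and a way to descend into the second menu level after a selection; both are left out here.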

Because the touch pad is divided into portions, says Baudisch, people can easily learn where menu items are and quickly jump to certain items without having to scroll through a list, as with an iPod. Another feature of earPod, he says, is that a user doesn’t need to wait until a menu item is read before moving on to another. When a finger moves to a new sector, the audio is interrupted and the new item is announced.
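The interruption behavior can be illustrated with a small simulation. The sketch below is hypothetical: the class name, the use of explicit timestamps instead of real audio playback, and the clip durations are all assumptions made for the example. It records how much of each announcement a user would actually hear when a new item cuts off the previous one.

```python
class InterruptibleVoice:
    """Plays one clip at a time; starting a new clip cuts off the old one."""

    def __init__(self):
        self.playing = None  # (name, start_time, duration) or None
        self.log = []        # (name, seconds_heard) for each ended clip

    def say(self, name: str, duration: float, now: float) -> None:
        # Stop whatever is playing before starting the new announcement.
        self.stop(now)
        self.playing = (name, now, duration)

    def stop(self, now: float) -> None:
        if self.playing is None:
            return
        name, start, duration = self.playing
        # A clip is heard until it ends or is interrupted, whichever is first.
        self.log.append((name, min(now - start, duration)))
        self.playing = None


voice = InterruptibleVoice()
voice.say("color", duration=0.6, now=0.0)  # durations are made-up values
voice.say("fish", duration=0.5, now=0.2)   # finger moved on after 0.2 s
voice.stop(now=1.0)
print(voice.log)
```

In this run the first announcement is cut short when the second begins, while the second plays to completion, so a user who already knows the layout never has to wait for a full clip.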

In the earPod usability study, conducted by Shengdong Zhao, a doctoral student at the University of Toronto and the project lead, the researchers found that people who had no experience using either an iPod or an earPod-equipped device used the two with equal accuracy. EarPod was 92.1 percent accurate, while the visual system was 93.9 percent accurate, but the difference was not statistically significant. It took people longer to grow accustomed to earPod, but with experience, users became faster on the audio menu. After 30 minutes of training on both devices, subjects could navigate two levels of menu with earPod in 2.1 seconds as opposed to 2.5 seconds with the visual menu.

Georgia Tech’s Walker is impressed with the earPod approach and results. “My overall impression is that this is great … It was inevitable: trying to look at how to take an interface that is purely visual on the iPod and turn it into an interface that’s purely auditory, because, after all, the iPod’s an auditory device. Why should a person have to pull their player out while they’re jogging to look at it?”

Currently, however, earPod could not be a complete replacement for an iPod menu, Walker notes. One reason is that earPod doesn’t lend itself to menu flexibility. Once a person learns the position of the menu items, he or she might become frustrated if those positions need to change due to a software update or added playlist. In particular, the approach would not work well for menus such as mobile-phone address books, Walker says.

In addition, adds Baudisch, because the circular touch pad is divided into sectors, there is a limited number of menu items that a person can access. If there are 8 sectors, each with 8 menu items, then there are only 64 total items accessible on the device, and this wouldn’t be good enough for iPods that hold hundreds of playlists and thousands of songs. However, Baudisch suspects that future prototypes will provide ways to get around the problem. He and his team are exploring how people respond to faster audio output (speeding up the recorded voice) and how people use audio and visual cues simultaneously. Developing an all-encompassing interface for eyes-free operations on auditory devices is still a future project, he says.
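The capacity limit is simple arithmetic: with a fixed number of sectors per level, the number of reachable items is the sector count raised to the number of levels. The sketch below works that out. The function names are illustrative, and deepening the hierarchy is shown only to make the scaling concrete; the article does not say it is the workaround the researchers have in mind.

```python
def reachable_items(sectors: int, levels: int) -> int:
    """Items reachable when every level offers `sectors` choices."""
    return sectors ** levels


def levels_needed(sectors: int, items: int) -> int:
    """Smallest number of levels whose capacity covers `items`."""
    levels = 1
    while reachable_items(sectors, levels) < items:
        levels += 1
    return levels


print(reachable_items(8, 2))   # the prototype: 8 sectors, 2 levels
print(levels_needed(8, 1000))  # 1000 is a hypothetical library size
```

With eight sectors, two levels give the 64 items of the prototype, and a hypothetical library of 1,000 songs would already need four levels, which suggests why a fixed pie layout alone does not scale to a full music collection.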
