MIT Technology Review



Sampling Songs
Digital fingerprints make for easier searching

Results: Microsoft researchers have developed software that automatically identifies audio files, including streaming audio, by extracting and encoding short sections of them to form “fingerprints.” Christopher Burges and colleagues have built two new applications on this audio-recognition technology: one identifies duplicate files in a large audio collection, and the other creates “thumbnails,” recognizable 15-second snippets of each file. The duplicate detector found duplicates in a database of more than 40,000 audio files with a 1.2 percent error rate. In another test involving 68 songs, a panel of users compared thumbnails made with the Microsoft software against snippets that began 30 seconds into each song, and rated the Microsoft thumbnails as more likely to contain the titles, choruses, or other distinctive features of the songs.

Why it Matters: Today’s digital-audio libraries keep growing, and users must sort through them manually to find and remove duplicate files. Microsoft’s method of spotting duplicates could make consolidating large song collections easier and faster. Many online music purveyors also offer their customers previews of songs. Currently, those previews are created either manually (someone listens to the song to find a recognizable chorus, then makes the snippet) or by software that samples only a predetermined segment of each song, which may not contain readily recognizable material. The new software can automatically find the defining part of a song when extracting a thumbnail, making the thumbnail a better indicator of the song’s identity.

Methods: The duplicate detector extracts a fingerprint for each file and puts it into a database. To compare two songs, it considers the location from which the first song’s fingerprint was extracted and looks for a matching fingerprint in the same vicinity in the second song. If it finds a match, it identifies the two as duplicates. After analyzing all the songs in the database, the detector presents the user with a list of duplicate songs.
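The comparison described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the researchers' implementation: each song is represented as a list of precomputed integer "frames," a fingerprint is simply a short run of those frames, and a match means exact equality. Real audio fingerprints would be compared with a similarity threshold. The function names and the `offset`, `length`, and `slack` parameters are all hypothetical.

```python
from itertools import combinations


def extract_fingerprint(frames, offset, length):
    """Hypothetical fingerprint: the run of `length` frames starting at `offset`."""
    return tuple(frames[offset:offset + length])


def matches_near(fingerprint, frames, offset, slack):
    """Search `frames` for the fingerprint within +/- `slack` positions of `offset`."""
    n = len(fingerprint)
    start = max(0, offset - slack)
    stop = min(len(frames) - n, offset + slack)
    for pos in range(start, stop + 1):
        # Assumption: exact equality stands in for a real similarity measure.
        if tuple(frames[pos:pos + n]) == fingerprint:
            return True
    return False


def find_duplicates(library, offset=4, length=3, slack=2):
    """Return pairs of file names whose fingerprints match in the same vicinity."""
    # Step 1: extract one fingerprint per file and store it in a "database".
    database = {
        name: extract_fingerprint(frames, offset, length)
        for name, frames in library.items()
    }
    # Step 2: compare every pair of files.
    duplicates = []
    for first, second in combinations(sorted(library), 2):
        fingerprint = database[first]
        if len(fingerprint) < length:
            continue  # file too short to yield a full fingerprint
        if matches_near(fingerprint, library[second], offset, slack):
            duplicates.append((first, second))
    return duplicates
```

Searching a small window around the original offset, rather than only the exact position, lets the sketch tolerate a copy that has a little extra silence at the start.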

The thumbnail generator compares fingerprints within a file. If it finds similar fingerprints at different points, it identifies them as the song’s chorus or some other characteristic feature. If fingerprint analysis doesn’t find a clear repeating feature, the software can analyze other aspects of the song, such as patterns of sound frequencies, to pick out a characteristic section. The software then extracts the 15 seconds of audio surrounding that section as the thumbnail.
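A rough Python sketch of the repeated-section search follows. It rests on the same simplifying assumptions as any toy model: the song is a list of precomputed integer "frames," a fingerprint is a short run of frames, and "similar" means identical. The fallback analysis of sound-frequency patterns mentioned above is not modeled; the sketch just returns `None` in that case. The names `find_repeated_section` and `thumbnail_bounds` and their parameters are hypothetical.

```python
def find_repeated_section(frames, length):
    """Return the offset of the first fingerprint that recurs later in the
    song without overlapping itself, or None if nothing repeats."""
    seen = {}  # fingerprint -> first offset at which it appeared
    for pos in range(len(frames) - length + 1):
        fingerprint = tuple(frames[pos:pos + length])
        first = seen.get(fingerprint)
        if first is not None and pos - first >= length:
            return first
        seen.setdefault(fingerprint, pos)
    return None


def thumbnail_bounds(frames, length, frames_per_second, thumb_seconds=15):
    """Return (begin, end) frame indices of a thumbnail centered on the
    repeated section, or None when no repeated section is found."""
    start = find_repeated_section(frames, length)
    if start is None:
        return None  # the frequency-pattern fallback is not sketched here
    thumb_len = thumb_seconds * frames_per_second
    center = start + length // 2
    # Center the window on the repeated section, then clamp it to the song.
    begin = max(0, center - thumb_len // 2)
    begin = min(begin, max(0, len(frames) - thumb_len))
    end = min(len(frames), begin + thumb_len)
    return begin, end
```

For a song with a four-frame chorus that appears twice, `thumbnail_bounds` returns a 15-second window that contains the first occurrence of the chorus.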

Next Step: The researchers are working with Microsoft’s product teams to commercialize this technology. Potential applications might include software that cleans up music collections on home computers, freeing up disk space. Online music vendors could also use the thumbnail generator to create previews of the songs offered on their websites.

By Jean Thilmany

Source: Burges, C., et al. 2005. Using audio fingerprinting for duplicate detection and thumbnail generation. Paper presented at the IEEE Conference on Acoustics, Speech, and Signal Processing. March 18-23. Philadelphia, PA.


Tagged: Computing
