Getting Computers Into the Groove

Automated song analysis could lead to better recommendations for listeners.

Erica Naone

June 18, 2009

Computers have revolutionized the production, distribution and consumption of music, but when it comes to recommending a good tune, they’re still sorely lacking.

**Sounds alike:** A Facebook game called Herd It is being used by researchers at the University of California, San Diego, to classify music into different genres.

There are plenty of recommendation systems out there. iTunes offers Genius, which creates playlists and suggests music by comparing a collection to those of other users, and numerous music-oriented social-networking sites offer recommendations inspired by what a person’s friends are listening to. Now researchers at the University of California, San Diego (UCSD), are using machine learning, in combination with a Facebook game, to classify music based on automated analysis of the songs.

Gert Lanckriet, an assistant professor at UCSD who is working on the project, says that because his group’s music search and recommendation engine is automated, it could analyze huge quantities of songs, potentially giving users recommendations from a much larger library of music. The system can also make judgments about songs that it’s never come across before.

The UCSD researchers want their system to be able to tag songs so that users can search not just by artist or song title, but also by genre, instrument, and even descriptive words such as “romantic” or “spooky.” With this goal in mind, they’re collecting information about songs using a Facebook application called Herd It. The game awards users points when they tag songs in ways that agree with other users’ tags, collecting massive amounts of data in the process.
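The agreement mechanic can be illustrated with a small sketch. The scoring rule here (one point for each other player who chose the same tag) is an assumption made for illustration, as are the function name `agreement_points` and the player names; the article does not describe how Herd It actually computes points.

```python
from collections import Counter


def agreement_points(tags_by_player):
    """Give each player one point per other player who picked the same tag.

    Hypothetical scoring rule, not Herd It's actual one.
    """
    counts = Counter(tags_by_player.values())
    # Subtract 1 so a player does not earn a point for agreeing with themselves.
    return {player: counts[tag] - 1 for player, tag in tags_by_player.items()}


# Made-up tags from one round of the game, all for the same song.
round_tags = {"ana": "romantic", "ben": "romantic", "cho": "spooky", "dee": "romantic"}

points = agreement_points(round_tags)

# The most common tag is the kind of label that could later be used as training data.
consensus_tag, votes = Counter(round_tags.values()).most_common(1)[0]
print(points, consensus_tag, votes)
```

A rule like this rewards consensus, so a side effect of play is a set of song labels that many listeners agree on.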

Once that data is collected, Lanckriet says, the researchers’ system groups songs according to the tags given to them by users, then searches for distinctive patterns in the music itself. It applies a statistical analysis to the waveform patterns that represent each song, looking for common features among songs grouped together by tag.
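As a rough illustration of grouping songs by tag and then looking for shared audio features, the sketch below averages the features of every song carrying a tag and labels a new song with the closest average. This nearest-centroid approach is a deliberately simple stand-in, not the statistical model the UCSD group uses, and the two-number feature vectors (imagined as beat strength and brightness) are invented for the example.

```python
import math
from collections import defaultdict


def tag_centroids(songs):
    """Average the feature vectors of all songs that share a tag."""
    grouped = defaultdict(list)
    for features, tags in songs:
        for tag in tags:
            grouped[tag].append(features)
    return {
        tag: [sum(column) / len(vectors) for column in zip(*vectors)]
        for tag, vectors in grouped.items()
    }


def predict_tag(features, centroids):
    """Return the tag whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda tag: math.dist(features, centroids[tag]))


# Invented (beat strength, brightness) features paired with user-supplied tags.
training = [
    ([0.9, 0.2], {"hip-hop"}),
    ([0.8, 0.3], {"hip-hop"}),
    ([0.2, 0.8], {"romantic"}),
    ([0.3, 0.9], {"romantic"}),
]

centroids = tag_centroids(training)
predicted = predict_tag([0.85, 0.25], centroids)  # a song with no tags yet
print(predicted)
```

Because the prediction depends only on the audio features, the same function can label a song that no user has tagged.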

About 90 percent of the time, Lanckriet says, the system identifies patterns that are ordinarily hidden. For example, the patterns that identify a hip-hop song might include a typical hip-hop beat, but also elements that the listener wouldn’t recognize as a pattern within the song. “On average, these automatic tags predict other humans’ [tags] pretty much as accurately as a given human person can do,” Lanckriet says.

The researchers are currently collecting more data to train their system, and Lanckriet believes that it has commercial potential. He envisions a system that could take an unfamiliar song (from an independent band, or even something recorded in a user’s garage), analyze it on the fly, and suggest appropriate tags and similar music.

The popular Internet radio site Pandora performs a similar service, breaking down songs and analyzing their attributes. Founded in 2000, the site allows users to choose a song or artist, and then finds similar songs. Users can quickly fine-tune the results to create a highly personalized streaming radio station.

But Pandora’s technology is “100 percent manual,” according to Tim Westergren, the company’s chief strategy officer and founder. Through the Music Genome Project, a team of musicians evaluates songs, scoring them according to 400 different attributes. Once these attributes are identified, Westergren says, “it’s pretty straightforward math” to make recommendations to users. He says that Pandora is open to incorporating more automated approaches to analyzing songs but adds, “We haven’t yet found one that we think is really value added to what we’re doing.”
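Westergren does not say which math Pandora uses. One common way to compare songs that have each been scored on the same attributes is cosine similarity between their score vectors, sketched below as an assumption. The three-attribute vectors and song titles are invented, standing in for the roughly 400 attributes described above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two attribute-score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(seed, catalog):
    """Rank every other song in the catalog by similarity to the seed song."""
    return sorted(
        (title for title in catalog if title != seed),
        key=lambda title: cosine_similarity(catalog[seed], catalog[title]),
        reverse=True,
    )


# Invented attribute scores assigned by hypothetical human reviewers.
catalog = {
    "seed song": [5, 1, 4],
    "close match": [4, 1, 5],
    "far match": [1, 5, 1],
}

ranking = most_similar("seed song", catalog)
print(ranking)
```

Once the attribute scores exist, ranking is cheap. The expensive step in a manual system is producing the scores.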

Other companies are also working on automatic analysis of music. The Echo Nest, a startup based in Somerville, MA, transforms the waveform patterns of songs according to simulations of how the human ear hears music. From there, the Echo Nest’s system applies filters that identify features of the song, such as tempo and pitch, according to company cofounder and CTO Tristan Jehan.
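To give a sense of how a feature such as tempo can be pulled out of a signal, the sketch below uses autocorrelation: it finds the time lag at which a loudness envelope best lines up with itself. This is a generic textbook technique, not the Echo Nest's method, and the envelope is synthetic, with one pulse every half second.

```python
def estimate_tempo(envelope, frame_rate, min_bpm=60, max_bpm=180):
    """Estimate beats per minute from an onset-strength envelope.

    envelope: one strength value per frame; frame_rate: frames per second.
    """
    min_lag = int(frame_rate * 60 / max_bpm)  # shortest beat period, in frames
    max_lag = int(frame_rate * 60 / min_bpm)  # longest beat period, in frames

    def autocorr(lag):
        # High when the envelope resembles itself shifted by `lag` frames.
        return sum(envelope[i] * envelope[i + lag] for i in range(len(envelope) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return 60 * frame_rate / best_lag


# Synthetic 10-second envelope at 100 frames per second with a pulse every 50 frames.
frame_rate = 100
envelope = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]

tempo = estimate_tempo(envelope, frame_rate)
print(tempo)
```

Real recordings give much noisier envelopes, so practical systems add smoothing and guard against picking half or double the true tempo.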

Once that’s done, the Echo Nest’s system combines this information with tagging information gleaned from blogs and other data posted on the Internet. It then applies machine-learning algorithms to identify features of songs that are commonly associated with specific tags, much as the UCSD researchers’ software does.

The difference, according to Jehan, is that instead of identifying complex patterns in the waveforms, the Echo Nest’s software focuses on features that would be recognized by a human listener.

Forrester Research analyst Sonal Gandhi, who follows the music industry, says that more automated methods of music search and recommendation could become important as on-demand music becomes more popular, and sites feel increased pressure to help users find new music.

Tim Crawford, a senior lecturer in computational musicology at Goldsmiths, University of London, says that while analyzing music using computers is “a very interesting and promising area of research,” it will be hard to create a music search engine that’s both general and fully automatic. “Music similarity is such a personal and variable thing,” Crawford says. “Two heavy-metal tracks may seem highly similar to a classical-music expert like me, but entirely different to a heavy-metal enthusiast, who may in turn regard the music of Brahms and Tchaikovsky as very similar, which would be laughable to me.”
