New work from students at the University of Hong Kong describes a novel use of neural networks, collections of artificial neurons or nodes that can be trained to accomplish a wide variety of tasks, previously used only in image recognition. The students used a convolutional network to “learn” features, such as tempo and harmony, from a database of songs that spread across 10 genres. The result was a set of trained neural networks that could correctly identify the genre of a song, which in computer science is considered a very hard problem, with greater than 87 percent accuracy. In March the group won an award for best paper at the International Multiconference of Engineers and Computer Scientists.
What made this feat possible was the depth of the student’s convolutional neural network. Conventional “kernel machine” neural networks are, as Yoshua Bengio of the University of Montreal has put it, shallow. These networks have too few layers of nodes–analogous to the layers of neurons in your cerebral cortex–to extract useful amounts of information from complex natural patterns.
In their experiments, the students, led by professor Tom Li, discovered that the optimal number of layers for musical genre recognition was three convolutional (or “thinking”) layers, with the first layer taking in the raw input data and the third layer outputting the genre data.
In each layer (pictured above) a single node, or neuron, “hears” only a tiny portion of the song, about 23 milliseconds. Each node overlaps 50 percent with its neighbors, however, and so in total the many nodes in the neural network hear a little more than two seconds of the song.
While a human might be hard-pressed to identify the genre of a track in so short a time, this particular algorithm does so easily when applied to songs from the standard library used for testing automated genre recognition. However, it fell flat in subsequent tests in which the students exposed it to music outside of the library on which it was trained.
They attribute the failure of their algorithm to work “in the wild” to an insufficiently large training library on which the network learned in the first place. Because their algorithm was able to chew through 240 songs in just two hours, the Hong Kong students say it has the potential to be quite scalable.
Intriguingly, the convoluted neural network on which this work is based was originally inspired by an examination of the cat visual cortex. Cats, being mammals, have visual cortexes not unlike our own. Experiments done in a related species, the ferret, have shown that, in the inverse of what was done in this paper where a visual neural network was applied to a problem in hearing, it’s possible to re-wire a mammalian brain to see with its auditory cortex.
If convoluted neural networks are as flexible as the perceptual systems of mammals on which they are based, why aren’t they being applied to all sorts of other problems of perception in AI?
Forget dating apps: Here’s how the net’s newest matchmakers help you find love
Fed up with apps, people looking for romance are finding inspiration on Twitter, TikTok—and even email newsletters.
How AI is reinventing what computers are
Three key ways artificial intelligence is changing what it means to compute.
These weird virtual creatures evolve their bodies to solve problems
They show how intelligence and body plans are closely linked—and could unlock AI for robots.
We reviewed three at-home covid tests. The results were mixed.
Over-the-counter coronavirus tests are finally available in the US. Some are more accurate and easier to use than others.
Get the latest updates from
MIT Technology Review
Discover special offers, top stories, upcoming events, and more.