Emerging Technology from the arXiv


Machine-Learning Algorithm Mines Rap Lyrics, Then Writes Its Own

An automated rap-generating algorithm pushes the boundaries of machine creativity, say computer scientists.

May 20, 2015

The ancient skill of creating and performing spoken rhyme is thriving today because of the inexorable rise in the popularity of rapping. This art form is distinct from ordinary spoken poetry because it is performed to a beat, often with background music.

And the performers have excelled. Adam Bradley, a professor of English at the University of Colorado, has described it in glowing terms. Rapping, he says, crafts “intricate structures of sound and rhyme, creating some of the most scrupulously formal poetry composed today.”

The highly structured nature of rap makes it particularly amenable to computer analysis. And that raises an interesting question: if computers can analyze rap lyrics, can they also generate them?

Today, we get an affirmative answer thanks to the work of Eric Malmi at Aalto University in Finland and a few pals. These guys have trained a machine-learning algorithm to recognize the salient features of a few lines of rap and then choose another line that rhymes in the same way on the same topic. The result is an algorithm that produces rap lyrics that rival human-generated ones for their complexity of rhyme.

Various forms of rhyme crop up in rap but the most common, and the one that helps distinguish it from other forms of poetry, is called assonance rhyme. This is the repetition of similar vowel sounds, as in the words “crazy” and “baby,” which share two similar vowel sounds. (That’s different from consonance, which uses similar consonant sounds, as in “pitter patter,” and different from perfect rhyme, where words share the same ending sound, as in “slang” and “gang.”)

Because of its prevalence in rap, Malmi and co focus exclusively on the way assonance appears in rap lyrics. But they also assume a highly structured form of verse consisting of 16 lines, each of which equals one musical bar and so must be made up of four beats. The lines typically, but not necessarily, rhyme at the end.

To train their machine learning algorithm, they begin with a database of over 10,000 songs from more than 100 rap artists.

Spotting assonant rhymes is not hard. The words must first be converted into phonemes (assuming a typical American-English pronunciation). Finding rhymes is then simply a question of scanning the phonemes for similar vowel sounds while ignoring consonant sounds and spaces.
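As a rough illustration of that step, here is a minimal Python sketch. The tiny lexicon, the ARPAbet-style phoneme symbols, and the function name `vowel_sequence` are all assumptions made for this example rather than details from the paper; a real system would need a full pronouncing dictionary.

```python
# Toy pronunciation lexicon using ARPAbet-style symbols (an assumption for
# illustration only; it covers just the example words from the article).
TOY_LEXICON = {
    "crazy": ["K", "R", "EY", "Z", "IY"],
    "baby": ["B", "EY", "B", "IY"],
    "slang": ["S", "L", "AE", "NG"],
    "gang": ["G", "AE", "NG"],
}

# ARPAbet vowel symbols, without stress markers.
VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}


def vowel_sequence(text, lexicon=TOY_LEXICON):
    """Return the vowel phonemes of `text`, ignoring consonants and spaces.

    Words missing from the lexicon are silently skipped in this sketch.
    """
    vowels = []
    for word in text.lower().split():
        for phoneme in lexicon.get(word, []):
            if phoneme in VOWELS:
                vowels.append(phoneme)
    return vowels


# "crazy" and "baby" reduce to the same two vowels, so they count as assonant.
print(vowel_sequence("crazy"))  # ['EY', 'IY']
print(vowel_sequence("baby"))   # ['EY', 'IY']
```

Once every line is reduced to a vowel sequence like this, comparing lines becomes a matter of comparing short lists of symbols.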

That immediately suggests a way of ranking the complexity of lyrics. Malmi and co look for sequences of matching vowel sounds in the previous two lines or so. They then define “rhyming density” as the average of all the longest sequences in the lyrics.
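A simplified version of this measure can be sketched in Python. This is only one reading of the description above: it assumes each line has already been reduced to a list of vowel symbols, compares whole lines against the previous two, and uses hypothetical function names. The paper's exact procedure may differ.

```python
def longest_common_run(a, b):
    """Length of the longest contiguous run of vowels that appears in both a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the run that ended at the previous symbol of each list.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def rhyme_density(lines, window=2):
    """Average, over every line after the first, of the longest vowel run
    that the line shares with any of the previous `window` lines."""
    scores = []
    for i in range(1, len(lines)):
        context = lines[max(0, i - window):i]
        scores.append(max(longest_common_run(lines[i], p) for p in context))
    return sum(scores) / len(scores) if scores else 0.0


# Three toy lines given as vowel sequences (made up for illustration).
toy_lines = [["EY", "IY"], ["AH", "EY", "IY"], ["OW"]]
print(rhyme_density(toy_lines))  # 1.0
```

In the toy example the second line shares a two-vowel run with the first and the third line shares nothing, so the average is 1.0. Longer shared runs push the score up.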

This measure has allowed them to rank all the rap artists in their database according to their rhyming density. The three rappers who head the list are Inspectah Deck, Rakim, and Redrama. Rakim, in particular, is known for his multisyllabic rhymes.

Curiously, the rapper Eminem, who is also famous for his multisyllabic rhymes, comes surprisingly low on the list. That’s probably because Eminem often achieves his rhymes by “bending” words, a trick that this technique does not allow for.

Nevertheless, this metric is an interesting measure of a rapper’s rhyming skill and one that the team can use to compare their automated raps with human generated ones.

They next set their machine-learning algorithm, called DeepBeat, a task. Once it has mined the database, its goal is to analyze a sequence of lines from a rap lyric and then choose the next line from a list that contains randomly chosen lines from other songs as well as the actual line.

This it can do surprisingly well. “An 82% accuracy was achieved for separating the true next line from a randomly chosen line,” say Malmi and co.
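To make that accuracy figure concrete, here is a hedged Python sketch of how such a test could be scored. DeepBeat's own scoring model is learned from data; the stand-in `end_rhyme_score` below is a deliberately crude assumption that only counts matching vowels at the ends of lines, and all names and data are hypothetical.

```python
def end_rhyme_score(context, candidate):
    """Toy score: how many trailing vowels the candidate shares with the
    last line of the context. Lines are lists of vowel symbols."""
    last = context[-1]
    n = 0
    for a, b in zip(reversed(last), reversed(candidate)):
        if a != b:
            break
        n += 1
    return n


def pairwise_accuracy(examples, score=end_rhyme_score):
    """Fraction of examples in which the true next line outscores a random line.

    Each example is a tuple (context, true_next, random_line).
    """
    correct = sum(
        1
        for context, true_next, random_line in examples
        if score(context, true_next) > score(context, random_line)
    )
    return correct / len(examples)


# Two made-up examples: the toy score gets the first right and the second wrong.
examples = [
    ([["EY", "IY"]], ["AH", "EY", "IY"], ["OW"]),
    ([["AE"]], ["IH"], ["UW", "AE"]),
]
print(pairwise_accuracy(examples))  # 0.5
```

Any scoring function with the same two arguments could be swapped in for `end_rhyme_score`, which is what makes this kind of test useful for comparing models.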

That’s not bad and immediately suggests a way to generate lyrics automatically. Malmi and co start with a line from one rap lyric and ask the computer to search through the database for another line on the same topic that best rhymes. It then repeats this process for the next line and so on.
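The loop described here can be sketched in a few lines of Python. This is a toy under stated assumptions, not the team's method: "same topic" is reduced to a keyword match, "best rhymes" to counting matching vowels at the ends of lines, and the lines and their approximate vowel tags are invented for the example.

```python
def trailing_match(a, b):
    """Number of vowels that two vowel sequences share at their ends."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n


def generate(seed, database, keyword, length=4):
    """Greedily build a verse from existing lines.

    `seed` and every database entry are (text, vowels) tuples. At each step
    the unused line containing `keyword` that best rhymes with the previous
    line is appended.
    """
    verse = [seed]
    used = {seed[0]}
    while len(verse) < length:
        candidates = [
            (text, vowels)
            for text, vowels in database
            if keyword in text.lower().split() and text not in used
        ]
        if not candidates:
            break
        best = max(candidates, key=lambda line: trailing_match(verse[-1][1], line[1]))
        verse.append(best)
        used.add(best[0])
    return [text for text, _ in verse]


# Hypothetical lines with hand-written, approximate vowel tags.
seed = ("my love is true", ["AY", "AH", "IH", "UW"])
database = [
    ("i love you too", ["AY", "AH", "UW", "UW"]),
    ("love is a task", ["AH", "IH", "AH", "AE"]),
    ("leave it in the past", ["IY", "IH", "IH", "AH", "AE"]),
]
print(generate(seed, database, keyword="love", length=2))
# ['my love is true', 'i love you too']
```

Because every line is lifted whole from the database, the output is a collage of existing lyrics rather than newly written text, which matches the way the article describes the generated verse.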

The results are something of an eye-opener. Here is one that DeepBeat generated on the topic of love:

For a chance at romance I would love to enhance
But everything I love has turned to a tedious task
One day we gonna have to leave our love in the past
I love my fans but no one ever puts a grasp
I love you momma I love my momma – I love you momma
And I would love to have a thing like you on my team you take care
I love it when it’s sunny Sonny girl you could be my Cher
I’m in a love affair I can’t share it ain’t fair
Haha I’m just playin’ ladies you know I love you.
I know my love is true and I know you love me too
Girl I’m down for whatever cause my love is true
This one goes to my man old dirty one love we be swigging brew
My brother I love you Be encouraged man And just know
When you done let me know cause my love make you be like WHOA
If I can’t do it for the love then do it I won’t
All I know is I love you too much to walk away though

That’s impressive. Each of these lines is taken from another rap song—for example the final line is from Eminem’s

What’s more, this and other raps generated by DeepBeat have a rhyming density significantly higher than that of any human rapper. “DeepBeat outperforms the top human rappers by 21% in terms of length and frequency of the rhymes in the produced lyrics,” they point out.

Where DeepBeat falls down is in the coherence of its storytelling, which is unsurprising given that its focus is largely on rhyme. That’s clearly work for the future.

Ref: arxiv.org/abs/1505.04771: DopeLearning: A Computational Approach to Rap Lyrics Generation
