The Evolution of English Words and Phrases Since 1520

The digitisation of the world’s books reveals how the popularity of English words and phrases has evolved since the 16th century, and the top 100 lists for each year are now free to browse online.

Last year, the Google Books team released some 4 per cent of all the books ever written as a corpus of digitised text, an event that has triggered something of a revolution in the study of trends in human thought. The corpus consists of 5 million books and over 500 billion words (361 billion in English) dating from the 1500s to the present day.

In a single stroke, this data gives researchers a way to examine a whole range of hitherto inaccessible phenomena. Since then a steady stream of new results has emerged on everything from the evolution of grammar and the adoption of technology to the pursuit of fame and the role of censorship.

Today, Matjaz Perc at the University of Maribor in Slovenia uses this data to examine the evolution of the most common English words and phrases since 1520. 

As expected, the results show various power laws at play. Power laws are thought to arise in social systems because of a phenomenon of self-organisation called preferential attachment.

This is the idea that in a network, a node with more connections is likely to attract more connections in future. That’s why it is also known as the rich-get-richer effect or the Matthew effect (a biblical reference).

So it’s really no surprise that the popularity of words and phrases over time follows a similar law, given that the spread of language can be modelled as a network.
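
To make the rich-get-richer idea concrete, here is a minimal Python sketch of a Simon-style process. It is an illustrative assumption, not the model fitted in Perc's paper: the function name `simulate_word_usage`, the `new_word_prob` parameter and its value are all hypothetical. At each step a new word is coined with a small probability; otherwise an existing word is reused with probability proportional to how often it has already been used.

```python
import random
from collections import Counter

def simulate_word_usage(steps, new_word_prob=0.1, seed=0):
    """Toy rich-get-richer process (hypothetical; not the paper's model)."""
    rng = random.Random(seed)
    history = [0]      # one entry per past use; word 0 starts things off
    next_word = 1
    for _ in range(steps - 1):
        if rng.random() < new_word_prob:
            history.append(next_word)   # coin a brand-new word
            next_word += 1
        else:
            # Choosing a uniformly random past use picks each word with
            # probability proportional to its current usage count.
            history.append(rng.choice(history))
    return Counter(history)

counts = simulate_word_usage(20000)
ranked = [count for _, count in counts.most_common()]
# Compare the five most-used words with the median word.
print(ranked[:5], ranked[len(ranked) // 2])
```

In a run like this, a handful of early words should end up with very large counts while the typical word is used only once or twice, which is the heavy-tailed pattern a power law describes.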

What’s interesting about Perc’s work, however, is that he’s published the results on his website at http://www.matjazperc.com/ngrams/evolution.html.

Here you can see lists of the top 100 most popular 1-, 2-, 3-, 4- and 5-word phrases for each year of data from 1520 until 2008.
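
For readers who want to see what a "top n-word phrase" list involves, the sketch below counts n-word phrases in a piece of text and returns the most frequent ones. It is a toy illustration only: the helper `top_ngrams` and the sample sentence are hypothetical, tokenisation is a naive whitespace split, and Perc's analysis works from the Google Books corpus rather than raw text like this.

```python
from collections import Counter

def top_ngrams(text, n, k=100):
    """Return the k most frequent n-word phrases in text (toy sketch)."""
    words = text.lower().split()   # naive tokenisation: split on whitespace
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(grams).most_common(k)

# Made-up sample text, purely for illustration.
sample = "at the end of the day at the end of the road"
print(top_ngrams(sample, 5, k=1))
```

Running the same function over each year's text with n set from 1 to 5 and k=100 would give lists of the kind described above, under these simplifying assumptions.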

It makes for fascinating browsing. For example, the most popular 5-word phrase in 1520, the first year for which data is available, is “the pope and his followers”. In 2008, the last year for which data is available, it is “at the end of the”.

Worth a look if you have a few minutes to spare.

Ref: arxiv.org/abs/1212.1709: Evolution Of The Most Common English Words And Phrases Over The Centuries
