Why the “y’all” line divides the US north from south, not east from west

Variations in spoken English seem to spread more easily across the US than up or down. Data mining points to an explanation.

One of the curious features of language is that it varies from one place to another. Even among speakers of the same language, regional variations are common, and the divide between these regions can be surprisingly sharp.

That raises the interesting question of how these linguistic variations occur and why.

Today we get an insight thanks to the work of James Burridge at the University of Portsmouth in the UK and a few colleagues. They have studied the regional variations in spoken English across the United States. Their results suggest that linguistic forms spread to a greater extent in the east-west direction than north-south. And they raise the intriguing suggestion that a powerful principle of self-organization may be responsible.

The first person to map the way language varies with geography was a German linguist, Georg Wenker, who asked 50,000 schoolmasters across Germany to transcribe a list of sentences in the local dialect. Although overwhelmed by the data he compiled, Wenker mapped language variations across some parts of Germany for the first time, and his work became the foundation of many future studies across the world.

One of these is the Cambridge Online Survey of World Englishes, which asks 31 different questions to people in different parts of the world.

For example, question 5 is: What word(s) do you use in casual speech to address a group of two or more people?

This question has over 50,000 responses from the eastern part of the United States, where the most popular answers are you guys (35%) and y’all (15%). 

“Y’all” (blue) versus “you guys” (yellow) usage in the eastern US

Other questions generate a wider range of responses. Question 8—What do you call the gooey or dry matter that collects in the corners of your eyes, especially while you are sleeping?—has over 800 distinct families of responses. The most common: (eye) boogers, sleep, (eye) gunk, and (eye) crusties.

The question that Burridge and co investigate is how these responses are clustered in space. They use a standard algorithm to see how the responses change with location. This plots the responses on a map of the US, with similar answers denoted by the same color.

The results show clear geographical boundaries between different linguistic uses. For example, the term “you guys” is used most often in the northern parts of the US, while “y’all” is used more in the south.

The team investigate this in more detail by calculating how areas are linked to each other linguistically. So for each population center, they find the four other centers that have the most similar linguistic characteristics. They then connect these centers on a map of the US to visualize the links.
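The linking step can be sketched in a few lines of Python. This is only an illustration, not the team's code: it assumes each population center is summarized by a vector of response frequencies, and it measures similarity with plain Euclidean distance, a choice the article does not specify. The function name `most_similar_links` is hypothetical.

```python
import numpy as np

def most_similar_links(features, k=4):
    """Link each center to the k centers with the closest response profile.

    `features` holds one row per population center, e.g. the share of
    respondents giving each answer. Euclidean distance is an assumption
    made for this sketch, not a detail taken from the study.
    """
    features = np.asarray(features, dtype=float)
    # Pairwise distances between response-frequency vectors.
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a center is never linked to itself
    nearest = np.argsort(dist, axis=1)[:, :k]
    return [(i, int(j)) for i in range(len(features)) for j in nearest[i]]
```

Each returned pair `(i, j)` is one line to draw on the map, from center `i` to one of its `k` most similar centers.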

The direction of these links provides an important clue. By plotting the distribution of directions, the team show that links are more common between places that are at similar latitudes. So linguistic similarities are clearer along an east-west axis than a north-south axis. Or in other words, there is a linguistic boundary between the north and south of the US.
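To make the direction test concrete, here is a minimal Python sketch of how the orientation of each link could be measured. It is an assumption-laden illustration rather than the team's method: coordinates are (longitude, latitude) pairs in degrees, the Earth is treated as locally flat, and the function names and the 45-degree cutoff are hypothetical choices.

```python
import math

def link_angle_from_east_west(lon1, lat1, lon2, lat2):
    """Angle in degrees between a link and the east-west axis.

    0 means the link runs due east-west, 90 means due north-south.
    """
    mean_lat = math.radians((lat1 + lat2) / 2)
    # A degree of longitude covers less ground away from the equator.
    dx = (lon2 - lon1) * math.cos(mean_lat)
    dy = lat2 - lat1
    return math.degrees(math.atan2(abs(dy), abs(dx)))

def east_west_share(coords, links):
    """Fraction of links lying closer to east-west than to north-south."""
    angles = [link_angle_from_east_west(*coords[i], *coords[j]) for i, j in links]
    return sum(a < 45 for a in angles) / len(angles)
```

If links had no preferred direction, the share would sit near one half; a value well above that would point to the kind of east-west bias the team reports.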

Burridge and co say one reason for this could be historical—the colonization of the US took place largely in an east-to-west direction. “It is possible that this anisotropy is a historical artefact of the west-moving colonisation of the continent, leading to disproportionately strong east-west cultural identification,” they say.

Another possibility is that transport links are stronger in this direction, making it easier for linguistic variations to spread.

But which of these effects played the bigger role?

To tease this question apart, Burridge and co turn to a powerful but poorly understood phenomenon in the physics of complex systems—self-organization. Physicists have long noticed that complex systems can self-organize in a way that creates boundaries. For example, in low-temperature magnets, magnetic domains often form into stripes. Curiously, the shape of the magnet determines the number and direction of the stripes. In a rectangular system, these stripes form across the shortest width, and in greater numbers when the aspect ratio is higher. In other words, stripes form more readily across longer and narrower magnets.

Burridge and co investigate the possibility that a similar self-organizing effect is at work with language. In this case, the geographical shape of the US is less important than the ease of travel. So better east-west transportation links are analogous to shrinking the width of the US in that direction.

The team say that in this case, self-organizing makes a north-south divide almost inevitable, just as is observed. Without this shrinking, the observed north-south divide is still possible, but it is just one of several outcomes.

“Our work therefore takes a step towards answering the question of whether the observed north-south linguistic divide in the USA is merely a consequence of population distribution and geography,” conclude the team.

That falls well short of a proof that enhanced communication links play the crucial role in stimulating self-organization in linguistic variation.

But it is certainly a plausible idea. Self-organizing behavior is so ubiquitous elsewhere in the universe that it would be hard to explain why it doesn’t play a role here.

Ref: arxiv.org/abs/1811.08788: Statistical Physics of Language Maps in the USA
