The problems with Elon Musk’s plan to open-source the Twitter algorithm

It could introduce new security risks while doing little to boost transparency.

Chris Stokel-Walker

April 27, 2022

Associated Press

Just hours after Twitter announced it was accepting Elon Musk’s buyout offer, the SpaceX CEO made his plans for the social network clear. In a press release, Musk outlined the sweeping changes he intended to make, including opening up the algorithms that determine what users see in their feed.

Musk’s ambition to open-source Twitter’s algorithms is driven by his long-standing concern about potential political censorship on the platform, but it’s unlikely that doing so will have the effect he desires. Instead, it may bring a number of unexpected problems, experts warn.

Musk might have a strong aversion to authority, but his desire for algorithmic transparency happens to chime with the wishes of politicians around the world. The idea has been a cornerstone of multiple governments’ attempts to fight back against Big Tech in recent years.

For example, Melanie Dawes, chief executive of Ofcom, which regulates social media in the UK, has said that social media platforms will have to explain how their code works. And the European Union’s recently passed Digital Services Act, agreed on April 23, will likewise compel platforms to offer more transparency. In the US, Democratic senators introduced proposals for an Algorithmic Accountability Act in February 2022. Their goal is to bring new transparency and oversight of the algorithms that govern our timelines and news feeds, and much else besides.

Allowing Twitter’s algorithm to be visible to others, and adaptable by competitors, theoretically means someone could just copy Twitter’s source code and release a rebranded version. Large parts of the internet run on open-source software, most famously OpenSSL, a security toolkit used by large parts of the web, in which a major security vulnerability known as Heartbleed was disclosed in 2014.

There are even examples of open-source social networks already. Mastodon, a microblogging platform that was set up after concerns about the dominant position of Twitter, allows users to inspect its code, which is posted on the software repository GitHub.

But seeing the code behind an algorithm doesn’t necessarily tell you how it works, and it certainly doesn’t give the average person much insight into the business structures and processes that go into its creation.

“It’s a bit like trying to understand ancient creatures with genetic material alone,” says Jonathan Gray, a senior lecturer in critical infrastructure studies at King’s College London. “It tells us more than nothing, but it would be a stretch to say we know about how they live.”

There’s also not one single algorithm that controls Twitter. “Some of them will determine what people see on their timelines in terms of trends, or content, or suggested follows,” says Catherine Flick, who researches computing and social responsibility at De Montfort University in the UK. The algorithms people will primarily be interested in are the ones controlling what content appears in users’ timelines, but even that won’t be hugely useful without the training data.

“Most of the time when people talk about algorithmic accountability these days, we recognize that the algorithms themselves aren’t necessarily what we want to see—what we really want is information about how they were developed,” says Jennifer Cobbe, a postdoctoral research associate at the University of Cambridge. That’s in large part because of concerns that AI algorithms can perpetuate the human biases in data used to train them. Who develops algorithms, and what data they use, can make a meaningful difference to the results they spit out.

For Cobbe, the risks outweigh the potential benefits. The computer code doesn’t give us any insight into how algorithms were trained or tested, what factors or considerations went into them, or what sorts of things were prioritized in the process, so open-sourcing it may not make a meaningful difference to transparency at Twitter. Meanwhile, it could introduce some significant security risks.

Companies often publish impact assessments that probe and test their data protection systems to highlight weaknesses and flaws. When flaws are discovered, they get fixed, but data is often redacted to prevent security risks. Open-sourcing Twitter’s algorithms would make the entire code base of the website accessible to all, potentially allowing bad actors to pore over the software and find vulnerabilities to exploit.

“I don’t believe for a moment that Elon Musk is looking at open-sourcing all the infrastructure and security side of Twitter,” says Eerke Boiten, a professor of cybersecurity at De Montfort University.

Open-sourcing Twitter’s algorithms could create yet another problem: it could help bad actors get better at gaming the system, which could make one of Musk’s other stated goals, “defeating all spam bots,” even harder.

“That’s not necessarily because individuals would be able to understand the intricacies of how the code of the algorithm works. But they’d be able to discern roughly how Twitter recommends posts on users’ timelines,” says Boiten. While Twitter users aren’t exactly in the dark about how the platform operates now, open-sourcing its algorithms could provide bad actors with new ammunition, he says.

There are other, more troubling unintended consequences. One of the key worries is the inevitable squabbling that will ensue as people try, amateurishly, to parse the algorithm. That could lead to yet more poisonous and fruitless debates.

“I worry that it’ll be made into a mountain where it’s really just a molehill,” says Flick. “There’s a lot of hype about the mysterious algorithm, but in reality it’s likely that bad behavior has social consequences that are reflected in the weightings of the tweets of those people.”

Open-sourcing the algorithm won’t fix any issues with bias, and taking action to fix biases that are raised will undoubtedly be viewed through a political, rather than technological, lens—at a time when we’re already massively politically polarized.

For example, a recent paper by Twitter researchers, highlighting how the algorithms more readily promote right-wing content than left-wing content, has already become a lightning rod. “It’s going to be a mess,” says Flick.
