
Facebook’s Engineering Challenges Show How Fast It’s Growing

More than half the data the site currently stores was added in the last year.
June 23, 2010

Numbers on Facebook’s exponential growth often get thrown around, but it can be hard to comprehend what those numbers mean. The site is becoming the repository for huge quantities of information about the daily lives of its users, and it can be easier to understand how significant this is when listening to company executives discuss how Facebook struggles to manage all of this information. Bobby Johnson, director of engineering at Facebook, spoke this morning at the Usenix WebApps ‘10 conference in Boston, where he outlined how the company handles the technical and organizational problems created by its rapid rise.

The site’s 400 million users have an average of 130 friends each, and just this social graph data is tens of terabytes in size. What’s more, this data has to stay accessible at all times, since it’s used to respond to almost any action that a user takes. For example, when a user logs in, data about that person’s social connections is used to figure out what information to display in the user’s news feed (the first screen shown after login).
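The quoted figures hang together on a back-of-the-envelope basis. A quick check (the bytes-per-edge estimate below is our own assumption, not a number from Facebook or Johnson):

```python
# Sanity-check the social-graph scale quoted above.
# bytes_per_edge is an assumed figure for illustration only.
users = 400_000_000
avg_friends = 130
edges = users * avg_friends  # friendship edges stored
print(f"{edges:,} edges")

bytes_per_edge = 200  # assumed: two user IDs plus timestamps and metadata
total_tb = edges * bytes_per_edge / 1e12
print(f"~{total_tb:.0f} TB")
```

At roughly 52 billion edges, even a few hundred bytes of metadata per friendship puts the graph in the tens-of-terabytes range the article describes.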

On top of keeping track of users’ connections to each other, Facebook has increasingly become the archive for users’ personal memories. The site has long been the largest photo-sharing site on the Web, and virtual photo albums have in many cases replaced the paper albums that used to sit on people’s shelves.

But while the accumulation of photos and videos may become an issue for the site in the long run, Johnson says that for now the main issue is dealing with new data. More than half of the data currently on the site was added this year, he says. Facebook plans never to delete old data, but even if it did, Johnson notes, that would do little to relieve the challenge of storing the flood of new data.

The company obviously takes the responsibility of storing all this data seriously: it routinely replicates information at least three times to ensure it is safe from hardware failure and bugs. It's stunning, however, to contemplate how large a responsibility the company has for information belonging to a growing number of people around the world.
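The idea behind the three-copy policy can be sketched in a few lines. This is a toy illustration of replica placement in general; the article does not describe Facebook's actual placement strategy, and the hash-based node selection here is our own invention:

```python
# Toy sketch of keeping three copies of each record on distinct nodes,
# illustrating the "replicate at least three times" policy mentioned above.
# Not Facebook's actual scheme; node selection is a simple hash for demo.
import hashlib

NODES = [f"storage-{i}" for i in range(10)]
REPLICATION_FACTOR = 3

def replica_nodes(key: str, nodes=NODES, k=REPLICATION_FACTOR):
    """Deterministically pick k distinct nodes to hold copies of a record."""
    start = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(k)]

placement = replica_nodes("photo:12345")
assert len(set(placement)) == 3  # three distinct copies: survives two node losses
```

With three independent copies, a record remains readable even if two of its nodes fail simultaneously, which is the hardware-failure protection the article alludes to.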
