A View from Erica Naone

Facebook's Engineering Challenges Show How Fast It's Growing

More than half the data the site currently stores was added in the last year.

  • June 23, 2010

Numbers on Facebook’s exponential growth often get thrown around, but it can be hard to comprehend what those numbers mean. The site is becoming the repository for huge quantities of information about the daily lives of its users, and the significance of this is easier to grasp when company executives describe how Facebook struggles to manage it all. Bobby Johnson, director of engineering at Facebook, spoke this morning at the Usenix WebApps ’10 conference in Boston, where he outlined how the company handles the technical and organizational problems created by its rapid rise.

The site’s 400 million users have an average of 130 friends each, and this social graph data alone amounts to tens of terabytes. What’s more, this data has to stay accessible at all times, since it’s used to respond to almost any action that a user takes. For example, when a user logs in, data about that person’s social connections is used to figure out what information to display in the user’s news feed (the first screen shown after login).

On top of keeping track of users’ connections to each other, Facebook has increasingly become the archive for users’ personal memories. The site has long been the largest photo-sharing site on the Web, and virtual photo albums have in many cases replaced the paper albums that used to sit on people’s shelves.

But while the accumulation of photos and videos may become an issue for the site in the long run, Johnson says that for now the main issue is dealing with new data. More than half of the data currently on the site was added this year, he says. Facebook plans never to delete old data, but even if it did, Johnson notes, doing so would do little to relieve the challenge of storing the flood of new data.

The company obviously takes the responsibility of storing all this data seriously: it routinely replicates information at least three times to ensure it is safe from hardware failure and bugs. It’s stunning, however, to contemplate how large a responsibility the company has for information belonging to a growing number of people around the world.
