Technology Review - Published By MIT

How Facebook Copes with 300 Million Users

VP of Engineering Mike Schroepfer reveals the tricks that keep the world's biggest social network going.

By Erica Naone

Tuesday, September 22, 2009

Last week, the world's biggest social network, Facebook, announced that it had reached 300 million users and was making enough money to cover its costs.

Credit: Mozilla (portrait); Facebook (background)

The challenge of dealing with such a huge number of users has been highlighted by hiccups suffered by some other social-networking sites. Twitter was beleaguered with scaling problems for some time and became infamous for its "Fail Whale"--the image that appears when the microblogging site's services are unavailable.

In contrast, Facebook's efforts to scale have gone remarkably smoothly. The site handles about a billion chat messages each day and, at peak times, serves about 1.2 million photos every second.

Facebook vice president of engineering Mike Schroepfer will appear on Wednesday at Technology Review's EmTech@MIT conference in Cambridge, MA. He spoke with assistant editor Erica Naone about how the company has handled a constant flow of new users and new features.

Technology Review: What makes scaling a social network different from, say, scaling a news website?

Mike Schroepfer: Almost every view on the site is a logged-in, customized page view, and that's not true for most sites. So what you see is very different than what I see, and is also different than what your sister sees. This is true not just on the home page, but on everything you look at throughout the site. Your view of the site is modified by who you are and who's in your social graph, and it means we have to do a lot more computation to get these things done.

TR: What happens when I start taking actions on the site? It seems like that would make things even more complex.

MS: If you're a friend of mine and you become a fan of the Green Day page, for example, that's going to show up in my homepage, maybe in the highlights, maybe in the "stream." If it shows me that, it'll also say three of [my] other friends are fans. Just rendering that home page requires us to query this really rich interconnected dataset--we call it the graph--in real time and serve it up to the users in just a few seconds or hopefully under a second. We do that several billion times a day.
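
The kind of lookup Schroepfer describes can be illustrated with a small sketch. This is not Facebook's code: the `SocialGraph` class, its method names, and the story wording are all hypothetical, and the graph here is just a pair of in-process dictionaries. It shows only the shape of the query: intersect the viewer's friends with a page's fans, then leave out the friend whose action triggered the story.

```python
from collections import defaultdict


class SocialGraph:
    """Toy, in-process stand-in for the "graph" (hypothetical, for illustration)."""

    def __init__(self):
        self.friends = defaultdict(set)  # user -> set of friends
        self.fans = defaultdict(set)     # page -> set of users who are fans

    def add_friendship(self, a, b):
        # Friendship is symmetric, so record it in both directions.
        self.friends[a].add(b)
        self.friends[b].add(a)

    def become_fan(self, user, page):
        self.fans[page].add(user)

    def other_friends_who_are_fans(self, viewer, actor, page):
        # Friends of the viewer who are fans of the page, excluding the actor.
        return (self.friends[viewer] & self.fans[page]) - {actor}


def render_story(graph, viewer, actor, page):
    """Build the home-page line for one viewer; each viewer gets a different count."""
    others = graph.other_friends_who_are_fans(viewer, actor, page)
    return (f"{actor} became a fan of {page}. "
            f"{len(others)} of your other friends are also fans.")


graph = SocialGraph()
for friend in ["erica", "ann", "bob", "cat", "dan"]:
    graph.add_friendship("mike", friend)
for fan in ["erica", "ann", "bob", "cat", "zed"]:  # zed is not mike's friend
    graph.become_fan(fan, "Green Day")

story = render_story(graph, "mike", "erica", "Green Day")
```

Because the count depends on the viewer's own friend list, the same event produces a different story line for every viewer, which is why the result cannot simply be computed once and shared.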

TR: How do you handle that? Most sites deal with having lots of users by caching--calculating a page once and storing it to show many times. It doesn't seem like that would work for you.

MS: Your best weapon in most computer science problems is caching. But if, like the Facebook home page, it's basically updating every minute or less than a minute, then pretty much every time I load it, it's a new page, or at least has new content. That kind of throws the whole caching idea out the window. Doing things in or near real time puts a lot of pressure on the system because the live-ness or freshness of the data requires you to query more in real time.
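
A rough way to see why freshness undermines caching is to simulate a time-limited cache. The sketch below is an illustration under stated assumptions, not a description of Facebook's systems: `TTLCache`, `hit_rate`, the cache key, and the chosen intervals are all hypothetical. A cached page is reused only while it is younger than its time-to-live (TTL), so when content must be fresher than the gap between page loads, every load is a miss.

```python
class TTLCache:
    """Minimal time-to-live cache with an injected clock (hypothetical sketch)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, time the value was stored)
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute, now):
        entry = self.store.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            self.hits += 1       # cached copy is still fresh enough
            return entry[0]
        self.misses += 1         # missing or stale: do the expensive work again
        value = compute()
        self.store[key] = (value, now)
        return value


def hit_rate(ttl_seconds, load_interval_seconds, loads):
    """Fraction of page loads served from cache for one viewer's page."""
    cache = TTLCache(ttl_seconds)
    for i in range(loads):
        now = i * load_interval_seconds
        cache.get_or_compute("home:viewer", lambda: f"page rendered at {now}", now)
    return cache.hits / loads


# A page that may be an hour old, reloaded every two minutes: almost all hits.
slow_changing = hit_rate(ttl_seconds=3600, load_interval_seconds=120, loads=30)

# A page that must be under 30 seconds old, reloaded every two minutes: no hits.
fast_changing = hit_rate(ttl_seconds=30, load_interval_seconds=120, loads=30)
```

With these assumed numbers the slow-changing page misses only on its first load, while the fast-changing page is recomputed on every load.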

We've built a couple systems behind that. One of them is a custom in-memory database that keeps track of what's happening in your friends network and is able to return the core set of results very quickly, much more quickly than having to go and touch a database, for example. And then we have a lot of novel system architecture around how to shard and split out all of this data. There's too much data updated too fast to stick it in a big central database. That doesn't work. So we have to separate it out, split it out, to thousands of databases, and then be able to query those databases at high speed.
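
The interview does not say how Facebook partitions its data, so the following is only a minimal sketch of the general idea of sharding, with every name in it (`shard_for`, `ShardedActivityStore`, `NUM_SHARDS`) invented for illustration. It assumes data is split by hashing a user ID, keeps each shard as an in-memory dictionary, and answers a "what have my friends done recently" query by grouping the friend IDs by shard and merging the results.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # assumption for illustration; the interview mentions thousands of databases


def shard_for(user_id, num_shards=NUM_SHARDS):
    """Map a user ID to a shard index with a stable hash (CRC32)."""
    return zlib.crc32(str(user_id).encode("utf-8")) % num_shards


class ShardedActivityStore:
    """Toy store: each shard is an in-memory dict of user -> list of events."""

    def __init__(self, num_shards=NUM_SHARDS):
        self.num_shards = num_shards
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def record(self, user_id, timestamp, action):
        # A user's events always land on the same shard.
        shard = self.shards[shard_for(user_id, self.num_shards)]
        shard[user_id].append((timestamp, user_id, action))

    def recent_for(self, friend_ids, limit=10):
        # Group friend IDs by shard so each shard is visited once.
        by_shard = defaultdict(list)
        for friend_id in friend_ids:
            by_shard[shard_for(friend_id, self.num_shards)].append(friend_id)

        results = []
        for index, ids in by_shard.items():
            shard = self.shards[index]
            for friend_id in ids:
                results.extend(shard.get(friend_id, []))

        results.sort(reverse=True)  # newest first, since tuples start with the timestamp
        return results[:limit]


store = ShardedActivityStore()
store.record("erica", 1, "became a fan of Green Day")
store.record("ann", 2, "posted a photo")
store.record("bob", 3, "updated status")
store.record("zed", 4, "posted a link")  # not a friend of the viewer below

feed = store.recent_for(["erica", "ann", "bob"], limit=2)
```

In a real deployment each shard would be a separate database or server and the per-shard lookups would run in parallel; here they are plain dictionary reads, which is enough to show how one query fans out and is merged.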

Comments

  • In memory data-grid
    Interesting post. The part where you mention developing a custom in-memory database that can handle data partitioning and events seems to fit the GigaSpaces model perfectly. Did you try one of the available platforms before you built it in-house?

    michaelalon
    09/22/2009
    Posts:1
    Avg Rating:
    4/5
  • [no subject]
    It was less informative than I expected. I thought I would learn how Facebook really copes with such huge amounts of data almost instantly.

    iambahar
    09/22/2009
    Posts:1
    Avg Rating:
    4/5
    • Re:
      Yeah, it was pretty content-free. I suppose he's very constrained about what he can say, given that every word he says is probably going to be analyzed by FB's competitors for hints.

      snedunuri
      10/03/2009
      Posts:30
      Avg Rating:
      4/5
  • Facebook has stopped growing
    Very interesting article, but I saw statistics at compete dot com showing that visits to facebook.com have not grown in the last 2 months. Is traffic at social networks decreasing?

    asampau
    09/24/2009
    Posts:1
    Avg Rating:
    2/5
    • Re: Facebook has stopped growing
      Just recently, Facebook announced it had crossed 300 million users. According to their own statistics, they're still growing. I figure Facebook's growth is probably spread out internationally at this point--might that skew things for Compete?

      Aside from this individual case, measuring traffic is not as simple as it seems online. Check out Jason Pontin's article "But Who's Counting?" for a good discussion of issues with measurement online.

      Erica Naone
      09/25/2009
      Posts:43
      Avg Rating:
      4/5

MIT Massachusetts Institute of Technology © 2009 Technology Review. All Rights Reserved.