VP of Engineering Mike Schroepfer reveals the tricks that keep the world's biggest social network going.
Last week, the world's biggest social network, Facebook, announced that it had reached 300 million users and is making enough money to cover its costs.
The challenge of dealing with such a huge number of users has been highlighted by hiccups suffered by some other social-networking sites. Twitter was beleaguered by scaling problems for some time and became infamous for its "Fail Whale"--the image that appears when the microblogging site's services are unavailable.
In contrast, Facebook's efforts to scale have gone remarkably smoothly. The site handles about a billion chat messages each day and, at peak times, serves about 1.2 million photos every second.
Facebook vice president of engineering Mike Schroepfer will appear on Wednesday at Technology Review's EmTech@MIT conference in Cambridge, MA. He spoke with assistant editor Erica Naone about how the company has handled a constant flow of new users and new features.
Technology Review: What makes scaling a social network different from, say, scaling a news website?
Mike Schroepfer: Almost every view on the site is a logged-in, customized page view, and that's not true for most sites. So what you see is very different than what I see, and is also different than what your sister sees. This is true not just on the home page, but on everything you look at throughout the site. Your view of the site is modified by who you are and who's in your social graph, and it means we have to do a lot more computation to get these things done.
TR: What happens when I start taking actions on the site? It seems like that would make things even more complex.
MS: If you're a friend of mine and you become a fan of the Green Day page, for example, that's going to show up in my homepage, maybe in the highlights, maybe in the "stream." If it shows me that, it'll also say three of [my] other friends are fans. Just rendering that home page requires us to query this really rich interconnected dataset--we call it the graph--in real time and serve it up to the users in just a few seconds or hopefully under a second. We do that several billion times a day.
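As a rough illustration of the kind of lookup Schroepfer describes, the sketch below finds which of a user's friends are already fans of a page by intersecting two sets. It is a toy example, not Facebook's code: the `friends` and `fans` dictionaries, the sample names, and both function names are assumptions made up for this illustration.

```python
# Hypothetical toy data; none of these names or structures come from Facebook.
friends = {
    "me": {"ann", "bob", "cat", "dan"},
}
fans = {
    "Green Day": {"ann", "bob", "cat", "zed"},
}


def friends_who_are_fans(user, page, friends, fans):
    """Return the subset of the user's friends who are fans of the page."""
    return friends.get(user, set()) & fans.get(page, set())


def fan_story(user, actor, page, friends, fans):
    """Build a one-line story for the viewer, counting other friends who are fans."""
    others = friends_who_are_fans(user, page, friends, fans) - {actor}
    return f"{actor} became a fan of {page}; {len(others)} other friends are fans"
```

Even in this tiny form, the result depends on who is viewing the page, which is why the same story cannot simply be computed once and shown to everyone.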
TR: How do you handle that? Most sites deal with having lots of users by caching--calculating a page once and storing it to show many times. It doesn't seem like that would work for you.
MS: Your best weapon in most computer science problems is caching. But if, like the Facebook home page, it's basically updating every minute or less than a minute, then pretty much every time I load it, it's a new page, or at least has new content. That kind of throws the whole caching idea out the window. Doing things in or near real time puts a lot of pressure on the system because the live-ness or freshness of the data requires you to query more in real time.
We've built a couple systems behind that. One of them is a custom in-memory database that keeps track of what's happening in your friends network and is able to return the core set of results very quickly, much more quickly than having to go and touch a database, for example. And then we have a lot of novel system architecture around how to shard and split out all of this data. There's too much data updated too fast to stick it in a big central database. That doesn't work. So we have to separate it out, split it out, to thousands of databases, and then be able to query those databases at high speed.
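The idea of splitting data across many databases and then querying them together can be sketched in a few lines. The example below is only a minimal illustration under stated assumptions: it uses in-process dictionaries in place of real database servers, picks a shard by hashing the user ID, and stores events as `(timestamp, text)` tuples. The class `ShardedStore` and its methods are hypothetical names, and nothing here reflects how Facebook's systems are actually built.

```python
import hashlib


class ShardedStore:
    """Toy event store that spreads users across several in-memory shards."""

    def __init__(self, num_shards):
        # Each dict stands in for one separate database (an assumption).
        self.shards = [dict() for _ in range(num_shards)]

    def shard_for(self, user_id):
        """Map a user ID to a shard index with a stable hash."""
        digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def add_event(self, user_id, event):
        """Append a (timestamp, text) event to the shard that owns this user."""
        shard = self.shards[self.shard_for(user_id)]
        shard.setdefault(user_id, []).append(event)

    def recent_events(self, friend_ids, limit=10):
        """Gather friends' events from every relevant shard, newest first."""
        # Group friend IDs by shard so each shard is visited only once.
        by_shard = {}
        for fid in friend_ids:
            by_shard.setdefault(self.shard_for(fid), []).append(fid)

        results = []
        for index, ids in by_shard.items():
            shard = self.shards[index]
            for fid in ids:
                results.extend(shard.get(fid, []))

        # Merge the per-shard results by timestamp, most recent first.
        results.sort(key=lambda event: event[0], reverse=True)
        return results[:limit]
```

In a real deployment each shard would be a separate server queried in parallel; the sequential loop here only shows the routing and merging steps.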
It was less informative than I expected. I thought I would learn how Facebook really copes with such huge amounts of data almost instantly.
Very interesting article, but I saw statistics at compete dot com showing that visits to facebook.com have not grown in the last two months. Does traffic at social networks appear to be decreasing?
Re: Facebook has stopped growing
Just recently, Facebook announced it had crossed 300 million users. According to their own statistics, they're still growing. I figure Facebook's growth is probably spread out internationally at this point--might that skew things for Compete?
Aside from this individual case, measuring traffic is not as simple as it seems online. Check out Jason Pontin's article "But Who's Counting?" for a good discussion of issues with measurement online.
michaelalon
In memory data-grid
Interesting post. The part where you mentioned developing a custom in-memory database that can handle data partitioning and events seems to fit the GigaSpaces model perfectly. Did you try one of the available platforms out there before you built it in-house?