“You should never trust this number,” said Martin Farach-Colton, a professor of computer science at Rutgers University, speaking a little more than a year ago. “People make a big deal about it, and it’s not true.”
Farach-Colton was giving a public lecture about his two-year sabbatical working at Google. The number that he was disparaging was in the middle of his PowerPoint slide:
- 150 million queries/day
The next slide had a few more numbers:
- 1,000 queries/sec (peak)
- 10,000+ servers
- More than 4 tera-ops/sec at daily peak
- Index: 3 billion Web pages
- 4 billion total docs
- 4+ petabytes disk storage
A few people in the audience started to giggle: the Google figures didn’t add up.
I started running the numbers myself. Let’s see: “4 tera-ops/sec” means 4,000 billion operations per second; a top-of-the-line server can do perhaps two billion operations per second, so that translates to perhaps 2,000 servers, not 10,000. Four petabytes is 4 × 10^15 bytes of storage; spread that over 10,000 servers and you’d have 400 gigabytes per server, which again seems wrong, since Farach-Colton had previously said that Google puts two 80-gigabyte hard drives into each server.
And then there is that issue of 150 million queries per day. If the system is handling a peak load of 1,000 queries per second, that translates to a peak rate of 86.4 million queries per day, or perhaps 40 million queries per day if you assume that the system spends only half its time at peak capacity. No matter how you crank the math, Google’s statistics are not self-consistent.
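The back-of-the-envelope checks above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic: the slide figures and the two-billion-operations-per-server estimate come from the text, while the variable names are made up for illustration.

```python
# Figures quoted from the slide.
PEAK_QUERIES_PER_SEC = 1_000
CLAIMED_QUERIES_PER_DAY = 150_000_000
CLAIMED_SERVERS = 10_000
PEAK_OPS_PER_SEC = 4 * 10**12      # "4 tera-ops/sec"
DISK_BYTES = 4 * 10**15            # "4+ petabytes"

# Estimates used in the article's reasoning (not Google figures).
OPS_PER_SERVER = 2 * 10**9         # a top-of-the-line server, roughly
STATED_DISK_PER_SERVER = 2 * 80 * 10**9   # two 80-gigabyte drives

SECONDS_PER_DAY = 24 * 60 * 60

# Servers needed to deliver the peak operation rate.
servers_needed = PEAK_OPS_PER_SEC // OPS_PER_SERVER

# Disk per server if the storage is spread over the claimed server count.
disk_per_server = DISK_BYTES // CLAIMED_SERVERS

# Most queries a day could hold if the system ran at peak all day,
# and the figure if it ran at peak only half the time.
max_queries_per_day = PEAK_QUERIES_PER_SEC * SECONDS_PER_DAY
half_peak_queries_per_day = max_queries_per_day // 2

print(f"servers needed for peak ops: {servers_needed:,}")
print(f"disk per server: {disk_per_server / 10**9:.0f} GB "
      f"(stated: {STATED_DISK_PER_SERVER / 10**9:.0f} GB)")
print(f"queries/day at constant peak: {max_queries_per_day:,}")
print(f"queries/day at half peak: {half_peak_queries_per_day:,}")
print(f"claimed queries/day exceeds constant-peak ceiling: "
      f"{CLAIMED_QUERIES_PER_DAY > max_queries_per_day}")
```

The half-peak figure works out to 43.2 million, which the text rounds down to “perhaps 40 million.” Either way, the claimed 150 million queries per day is above the ceiling that a 1,000-per-second peak allows.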
“These numbers are all crazily low,” Farach-Colton continued. “Google always reports much, much lower numbers than are true.”
Whenever somebody from Google puts together a new presentation, he explained, the PR department vets the talk and hacks down the numbers. Originally, he said, the slide with the numbers said that 1,000 queries/sec was the “minimum” rate, not the peak. “We have 10,000-plus servers. That’s plus a lot.”
Just as Google’s search engine comes back instantly and seemingly effortlessly with a response to any query that you throw at it, hiding the true difficulty of the task from users, the company also wants its competitors kept in the dark about the difficulty of the problem. After all, if Google publicized how many pages it has indexed and how many computers it has in its data centers around the world, search competitors like Yahoo!, Teoma, and Mooter would know how much capital they had to raise in order to have a hope of displacing the king at the top of the hill.
Google has at times had a hard time keeping its story straight. When vice president of engineering Urs Hoelzle gave a talk about Google’s Linux clusters at the University of Washington in November of 2002, he repeated that figure of 1,000 queries per second, but he said that the measure was made at 2:00 a.m. on December 25, 2001. His point, obvious to everybody in the room, was that even by November 2002, Google was doing a lot more than 1,000 queries per second; just how many more, though, was anybody’s guess.
The facts may be seeping out. Last Thanksgiving, the New York Times reported that Google had crossed the 100,000-server mark. If true, that means Google is operating perhaps the largest grid of computers on the planet. “The simple fact that they can build and operate data centers of that size is astounding,” says Peter Christy, co-founder of the NetsEdge Research Group, a market research and strategy firm in Silicon Valley. Christy, who has worked in the industry for more than 30 years, is astounded by the scale of Google’s systems and the company’s competence in operating them. “I don’t think that there is anyone close.”
It’s this ability to build and operate incredibly dense clusters that is, as much as anything else, the secret of Google’s success. And the reason, explains Marissa Mayer, the company’s director of consumer Web products, has to do with the way that Google started at Stanford.
Instead of getting a few fast computers and running them to the max, Mayer explained at a recruiting event at MIT, founders Sergey Brin and Larry Page had to make do with hand-me-downs from Stanford’s computer science department. They would go to the loading dock to see who was getting new computers, then ask if they could have the old, obsolete machines that the new ones were replacing. Thus, from the very beginning, Brin and Page were forced to develop distributed algorithms that ran on a network of not-very-reliable machines.
Today this philosophy is built into the company’s DNA. Google buys the cheapest computers that it can find and crams them in racks and racks in its six (or more) data centers. “PCs are reasonably reliable, but if you have a thousand of them, one is going to fail every day,” said Hoelzle. “So if you can just buy 10 percent extra, it’s still cheaper than buying a more reliable machine.”
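Hoelzle’s rule of thumb can be turned into a rough estimate of what failure looks like at scale. The sketch below assumes his figure of about one failure per day per thousand machines, and it further assumes that machines fail independently; both are simplifications, and the function names are invented for illustration.

```python
# Assumption taken from the quote: roughly one failure per day
# for every thousand PCs.
MACHINES_PER_DAILY_FAILURE = 1_000


def expected_daily_failures(servers: int) -> float:
    """Average number of machines expected to fail per day."""
    return servers / MACHINES_PER_DAILY_FAILURE


def prob_at_least_one_failure(servers: int) -> float:
    """Chance that at least one machine fails on a given day,
    assuming independent failures (a simplification)."""
    per_machine = 1 / MACHINES_PER_DAILY_FAILURE
    return 1 - (1 - per_machine) ** servers


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7,} servers: about {expected_daily_failures(n):.0f} "
          f"failures per day")
```

Under these assumptions, a 10,000-server cluster loses about 10 machines a day and a 100,000-server cluster about 100, which is why the software has to treat failure as routine.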
Working at Google, an engineer told me recently, is the nearest you can get to having an unlimited amount of computing power at your disposal.