Physics and Beyond
Once the concept was introduced, grid computing suddenly seemed to fill a need of scientists all over the world. In Geneva, for example, the high-energy physics lab of the European Organization for Nuclear Research (known by the acronym CERN) was already planning its next-generation particle accelerator, the Large Hadron Collider, an effort promising to generate an overwhelming amount of data. “We estimated that when the collider started running in 2006 it would produce eight to 10 petabytes of particle collision data per year,” says Fabrizio Gagliardi, director of CERN’s annual seminar on computing for physicists. That’s petabytes: millions of gigabytes. Portions of this immense data load would have to be distributed to the institutions all over the world that participate in CERN experiments. And since the most interesting physics tends to be found in the rarest events, Gagliardi explains, scientists “would be processing every bit of that data in multiple ways,” looking for hints of the theoretically predicted but elusive Higgs boson, say, or particles that possess the mysterious quality known as supersymmetry. In short, the collider portended an enormous data management problem for which existing computer systems seemed inadequate. “We defined a computational architecture for what we would need,” Gagliardi recalls. “Then we went shopping for a system of tools to build it, and discovered that the computer scientists had already come up with solutions.”
Several solutions, actually. At the University of Virginia, computer scientist Andrew Grimshaw had been working since 1993 on an attractive and well-thought-out set of grid computing protocols known as Legion. (Legion is now being marketed by Avaki of Cambridge, MA, which Grimshaw founded.) But Globus had the advantage of being “open”: in the interests of getting it adopted as widely and as rapidly as possible, Foster and Kesselman had decided to emulate the developers of the now famous Linux operating system and make the Globus source code available to any users who wanted it, so that they could study it, experiment with it and suggest improvements.
The result was that Globus became the foundation for the European DataGrid, a three-year demonstration and software development project that launched on January 1, 2001, with a commitment of 13.5 million euros (roughly $12 million) from the European Union. By the beginning of 2002, the DataGrid had deployed more than 100 computers: 20 at CERN, the others at sites around the continent, according to Gagliardi, now the DataGrid’s director. The project has also expanded beyond particle physics to include two other scientific disciplines that face similarly daunting data-processing challenges: earth observation and biology.
Meanwhile, grid computing has been finding an even warmer welcome among scientists in the United States, with Globus again being the choice of virtually every large project. One of the first to get going was the Grid Physics Network. Organized by Foster and University of Florida physicist Paul Avery, this effort was launched in September 2000 with $11.9 million from the National Science Foundation. It focuses on the vast amounts of physics data generated by four different sources: two specialized particle detectors housed at the Large Hadron Collider; the Laser Interferometer Gravitational Wave Observatory, a Caltech-MIT collaboration that will detect gravitational waves from pulsars and the like; and the Sloan Digital Sky Survey, an international effort to map the faintest possible stars and galaxies, more than 100 million celestial bodies in all. More recent initiatives include the NSF’s Network for Earthquake Engineering Simulation grid, an effort to integrate observations and computer simulations now scattered among some 20 different labs, with the goal of producing more effective designs for earthquake-resistant structures.
And now, of course, there’s the TeraGrid, the “put-your-money-where-your-mouth-is grid,” as Argonne’s Charles Catlett calls it. “We’ve been talking for years,” says Catlett, the project’s executive director. But for the TeraGrid to achieve what it promises, the high-powered microcomputer clusters located at its four physical sites will have to be tied together by a dedicated network running at 40 gigabits per second, which will be right on the ragged edge of the state of the art. “This will show us a lot about how the software really works in a production environment,” says Catlett. He’s talking about the Globus software, the Internet protocols, the Linux operating system, all of it.
On the technical side, Catlett says, one of the big challenges is making sure that Globus can successfully scale up. It is critical, he notes, to make sure that Globus’s services and protocols “can deal with hundreds or thousands of times more devices than they handle now.” “Obviously,” agrees Foster, “there is lots that still needs to be done.”
Then there’s the business side. Here, grid computing runs into the same question that sank so many of the overoptimistic dot-coms: how will money be made from this technology? “If computing is a utility,” Foster says, “who’s going to pay for the infrastructure? What kind of services are people prepared to pay for?” In particular, where is the killer app, the must-have application that will drive the growth of grid computing the way the spreadsheet did personal computing? Most current grid projects have barely moved past the if-we-build-it-they-will-come stage.
On the other hand, says Foster, “we do have some ideas.” One notable example is the Access Grid, an Argonne-developed system-based, like so much else in grid computing, on Globus-that supports large-scale, multisite meetings over the Internet, as well as lectures and collaborative work sessions. It already links more than 80 academic and industry sites around the globe. Furthermore, says Foster, as more and more big scientific projects like the TeraGrid and the DataGrid come on line, there’s every reason to think that they will serve as laboratories for new grid applications that will then make their way into the commercial world, with huge impact. After all, the Internet’s killer app, the World Wide Web, didn’t come out of a corporate lab. It came out of CERN.