Computing as Utility
That’s a tall order. But it certainly describes the hope at IBM, which is the prime contractor for the TeraGrid, as well as for similar national grids in Europe. David Turek, vice president of emerging technologies for IBM’s server group, compares grid computing to the familiar grid of electrical power: “To use a hair dryer, you just plug it into a wall socket,” he says. “You don’t have to worry about how the turbine is designed up in Niagara Falls, or the physics of power transmission.” That’s exactly how Turek wants people to think about computing power. “In our vision of the future, if you’re a customer who occasionally needs 10 teraflops, for example, don’t buy a machine that’s underutilized most of the time; buy it from the grid. So grid computing will play into our vision of computing as a utility.”
While companies like IBM would build the large-scale grids, Turek says that many users will want to set up grids of their own. “You might see 10 to 20 departments coming together to create a campuswide or companywide grid, each contributing some of the computer power they control,” he says. In another scenario, several independent companies, such as defense contractors, might do much the same thing to create “virtual organizations”: ad hoc grids that would allow them to use one another’s proprietary data and software to prepare, say, a proposal for a new military aircraft. “That’s why we’re not going to espouse the grid as something that can be done only with IBM technology,” Turek explains. After all, he says, “if you get five companies wanting to come together on a grid, the likelihood of all five having the same servers is pretty slim.”
And that, Turek adds, is the beauty of the Globus Toolkit: a set of open-source software tools that is fast emerging as the de facto standard for grid computing, in much the same way that the hypertext transfer protocol, or HTTP, is the standard for linking documents on the Web. Indeed, the growing acceptance of Globus is largely responsible for today’s wave of grid computing excitement.
“The idea is to let the network provide the basic mechanisms for moving data around, while Globus provides mechanisms for resource sharing,” explains Carl Kesselman of the University of Southern California’s Information Sciences Institute. Kesselman has been developing the Globus Toolkit over the past five years in collaboration with Ian Foster, a University of Chicago computer scientist who heads Argonne’s distributed-systems laboratory.
The mechanisms that Globus provides are as essential to the computing grid’s operation as stoplights are to city traffic. One set of Globus software tools, for example, automatically roots out where on the grid a required database or program can be found. Other tools allow one-time login, so that the user isn’t constantly being asked for passwords for site after site after site. Still others divide a computational job into multiple subtasks and parcel them out among the various systems on the grid. And most important, Globus provides tools to implement security: assuring, for instance, that an outside program trying to interact with your machine is serving a legitimate purpose and hasn’t been sent by some malicious hacker.
Of course, none of this is entirely new: “It’s worth remembering,” notes Kesselman, “that ARPAnet [the military-built ancestor of the Internet] was built in the 1960s to give users on one campus shared access to resources on a different campus.” Likewise, he points out, methods for breaking computational jobs into smaller pieces for multiple machines were a perennial research topic throughout the 1970s and 1980s.
But it was only in the 1990s, Kesselman says, that the rapidly increasing power of computers and networks brought this trend, known as distributed computing, out of the laboratories. One result was a flurry of experiments in what is now known as “peer-to-peer” computing, all devoted in one way or another to harnessing the computing power and storage capacity of idle desktop machines. Among the best known of these efforts are Napster, the MP3 music file-sharing system, and SETI@home, in which radio telescope data from the search-for-extraterrestrial-intelligence project are distributed to PCs across the Internet.
At the same time, however, the high-performance-computer community began a series of less publicized but much more ambitious experiments in “metacomputing.” The idea was to make many distributed computers function like one giant computer. The metamachine’s keyboard and display would be sitting on someone’s desktop, as usual. But its central processor might actually be a supercomputer in Illinois, say, while its graphics processor might be an immersive-virtual-reality facility in California. It worked, says Kesselman; the only problem was that experimenters had to reinvent the wheel every time. “There was still no standard software for distributed computing,” he says, “no infrastructure to support it.”
The technology’s watershed event came in 1995, at a supercomputing conference sponsored by the Institute of Electrical and Electronics Engineers and the Association for Computing Machinery. There, 11 separate high-speed networks were briefly connected into one giant metacomputer in a demonstration called I-Way. Attendees thronging the San Diego Convention Center could play with an interactive model of the Chesapeake Bay ecosystem, or a high-resolution simulation of colliding spiral galaxies, among some 60 applications in all. Foster, who led the team that created some of the system’s underlying software, was especially impressed by I-Way’s potential use in collaborative design. In one demonstration, he recalls, researchers at Argonne teamed up with those at an industrial group, Nalco Fuel Tech, to make a virtual-reality simulation for designing incinerators. “Users at different sites could fly together through the incinerator, place injectors in it at various points and jointly study the effect on its output,” he says.
The demonstration had its intended effect. “I-Way convinced people that grid computing had great potential,” says Foster. One important payoff was that in October 1996, the U.S. Defense Advanced Research Projects Agency funded Kesselman and Foster’s Globus project to provide a solid foundation for grid computing. At the 1997 supercomputing conference, Foster and Kesselman demonstrated a grid with some 80 sites worldwide running Globus software, another feat that, in Foster’s view, “convinced people that grid computing was worthwhile and real.” By that point, moreover, Foster and Kesselman had even started to call it “grid computing,” playing on the analogy to the electrical grid.