“In the cloud, we have the opportunity for integrated design, where one entity can make design choices across an entire environment,” says Russ Daniels, chief technology officer of HP’s cloud-services division. “This way, we can understand the impact of design choices that we make at the infrastructure level, as well as the impact they have on higher-level systems.”
HP, for example, will be focusing in part on an ongoing project called Cells as a Service, an effort to create secure virtual “containers” that are composed of virtual machines, virtual storage volumes, and virtual networks. The containers can be split between separate data centers but still treated by consumers as a traditional, real-world collection of hardware.
Among Yahoo’s specific projects will be the development of Hadoop, an open-source software platform for creating large-scale data-processing and data-querying applications. Yahoo has already built one big cloud-computing facility called M45 that is operated in conjunction with Carnegie Mellon University. M45 will also be folded into this new project.
Running in parallel with this systems-level research will be an assortment of other research projects designed to test the cloud infrastructure.
Computer scientists at the Illinois facility have a handful of data- and processing-intensive projects under way that are likely to be ported to the cloud facilities. According to Heath, one key thrust will be “deep search” and information extraction, such as allowing a computer to understand the real-world context of the contents found in a Web page. For example, today’s search engines have difficulty understanding that a phone number is in fact an active phone number, rather than just a series of digits. A project run by Urbana-Champaign professor Kevin Chang is exploring the idea of using the massive quantities of data collected by Web-wide search engines as a kind of cross-reference tool, so that the joint appearance of “555-1212” with “John Borland” multiple times online might identify the number as a phone number and associate it with that particular name.
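The cross-referencing idea can be illustrated with a minimal sketch. It is not the method used in Chang's project, which the article does not detail; it is only a toy version of the co-occurrence reasoning described above. The function names, the seven-digit pattern, the sample pages, and the two-page threshold are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed pattern: a seven-digit string shaped like "555-1212".
PHONE_LIKE = re.compile(r"\b\d{3}-\d{4}\b")


def cooccurrence_counts(pages, names):
    """Count how many pages mention each (name, number-like string) pair."""
    counts = Counter()
    for text in pages:
        # Use sets so a pair is counted at most once per page.
        numbers = set(PHONE_LIKE.findall(text))
        present = {name for name in names if name in text}
        for name in present:
            for number in numbers:
                counts[(name, number)] += 1
    return counts


def likely_phone_numbers(pages, names, min_pages=2):
    """Keep pairs that appear together on at least min_pages pages.

    The threshold is a hypothetical stand-in for whatever evidence a
    real system would require before treating digits as a phone number.
    """
    counts = cooccurrence_counts(pages, names)
    return {pair: n for pair, n in counts.items() if n >= min_pages}


# Made-up pages standing in for a Web-scale crawl.
pages = [
    "Contact John Borland at 555-1212 for details.",
    "John Borland (555-1212) wrote this story.",
    "Order 555-9999 shipped today.",
]

result = likely_phone_numbers(pages, ["John Borland"])
print(result)
```

Here "555-1212" appears alongside the name on two pages and is kept, while "555-9999" never appears with a name and is dropped. A real system would need far more data and far better evidence than a simple page count.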
Heath says that other projects might include experiments with tele-immersive communication: virtual-reality-type environments that let computers provide physical, or haptic, feedback to users as they communicate or engage in real-world activities controlled remotely over the Web.
In an e-mail, Intel Research vice president Andrew Chien said that other topics could include climate modeling, molecular biology, industrial design, and digital library research.
“By looking at what people are really doing, we will learn about what is really important from an infrastructure perspective,” says Raghu Ramakrishnan, chief scientist for Yahoo’s Cloud Computing and Data Infrastructure Group. “We already know enough to put forth systems that are usable today, but not enough that we can deliver on all the promise that people see in the paradigm.”