Pitting Cloud against Cloud

Tools that benchmark performance promise to reveal the strengths and weaknesses of competing cloud providers.
July 16, 2010

New software developed to measure the performance of different cloud computing platforms could make it easier for prospective users to figure out which of these increasingly popular services is right for them.

Right now, developers have few ways of comparing cloud providers, which lease access to computing power based in vast and distant data centers. Until actually migrating their software to a cloud service, they can’t know exactly how fast that service will perform calculations, retrieve data, or respond to sudden spikes in demand. But Duke University computer scientist Xiaowei Yang and her colleague Ang Li are trying to make the cloud market more like the car market, where, as Yang says, “you can compare specifications like engine size or top speed.”

Working with Srikanth Kandula and Ming Zhang of Microsoft Research in Redmond, WA, Yang and Li have developed a suite of benchmarking tools that make it possible to compare the performance of different cloud platforms without moving applications between them. The tools run algorithms to measure raw computation speed, and move data around to test how quickly new copies of an application can be created, how fast data can be stored and retrieved, how quickly data can be shuttled between applications inside the same cloud, and how responsive a cloud is to network requests from distant places. The researchers used the software to test the services offered by six providers: Amazon, Microsoft, Google, GoGrid, RackSpace and CloudSites. Results of those tests were combined with the providers’ pricing models to allow for quick comparisons.
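The article doesn’t include the tool itself, but the kinds of measurements it describes are simple to sketch. The snippet below is a minimal illustration in Python rather than the researchers’ software: it times a fixed computation and the round-trip latency of a small network request, with the workload size and URL as placeholder assumptions.

```python
# Minimal sketch, not the researchers' actual suite: timing two of the
# primitives the article describes. Workload size and URL are placeholders.
import time
import urllib.request


def time_computation(n=200_000):
    """How long a fixed sorting workload takes on this instance."""
    data = list(range(n, 0, -1))
    start = time.perf_counter()
    data.sort()
    return time.perf_counter() - start


def median_network_latency(url="https://example.com/", trials=5):
    """Median round-trip time for a small HTTP request to a distant endpoint."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        urllib.request.urlopen(url).read(1)
        samples.append(time.perf_counter() - start)
    return sorted(samples)[len(samples) // 2]


if __name__ == "__main__":
    print(f"computation: {time_computation():.3f} s")
    print(f"network:     {median_network_latency():.3f} s median round trip")
```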

The results are among the first attempts to compare the performance of several cloud platforms, says Yang. “We found that it’s very hard to find a provider that is best in all metrics,” she says. “Some are twice as fast for just 10 percent extra cost, which is a very good deal, but at the same time their storage service is actually very slow and has a lot of latency variation.” Another provider showed good computation speeds but was less quick at spawning new instances of an application–something that might be necessary for a service that experiences peaks in demand, as a video site does when some of its content goes viral. “It seems like in today’s market it is hard to pick a provider that is good at everything,” says Yang.

The researchers aren’t yet willing to disclose the performance scores of specific providers, but they plan to make their tools publicly available. “We are building a website where people will be able to download the software we used and see the results of the benchmarks,” says Yang. “We gathered our initial data by running the trials for a few hours, sometimes over two days. But ideally they should run every day to provide live data on the clouds’ performance.”

“Porting your application to four different cloud providers with different APIs and deployment methods is tedious, expensive, and error prone,” says Edward Walker, a researcher at Whitworth University in Washington, who separately developed a way to measure how good one of Amazon’s cloud services was at crunching scientific data. “A tool that allows you to automatically predict the performance of your application on a cloud platform is useful.”

Greater transparency in the cloud market could increase competition and ultimately result in better services for consumers, Walker says. However, he notes that the work Yang and her colleagues have done so far applies only to relatively simple Web applications. “More work needs to be done into investigating how feasible the framework will be for more complicated applications,” he says.

Yang agrees. “If you know your app is computationally intensive–for example, because it does a lot of sorting–you could choose a provider based on the metrics we gather on speed,” she says. “But if your app is 30 percent computation and 70 percent storage, it is harder to choose.”
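One way to make that concrete: a single ranking only emerges once each measured metric is weighted by how heavily the application depends on it. The toy calculation below uses invented numbers for two hypothetical providers to show how the same benchmark results can favor different clouds depending on the workload mix.

```python
# Toy illustration with invented scores (higher is better) for two
# hypothetical providers on two of the measured metrics.
providers = {
    "Provider A": {"computation": 0.9, "storage": 0.4},
    "Provider B": {"computation": 0.6, "storage": 0.8},
}


def weighted_score(scores, weights):
    """Combine per-metric benchmark scores using the app's workload mix."""
    return sum(scores[metric] * w for metric, w in weights.items())


# A compute-heavy mix favors Provider A; a 30/70 compute/storage mix favors B.
for mix in ({"computation": 0.8, "storage": 0.2},
            {"computation": 0.3, "storage": 0.7}):
    best = max(providers, key=lambda p: weighted_score(providers[p], mix))
    print(mix, "->", best)
```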

Thus, she and her colleagues have started work on a more sophisticated way to test cloud services. They are building software that captures the performance of an application when running on a local server and then creates a dummy version of it on several clouds to compare how they hold up. The dummy doesn’t actually perform the function of the software, but it exerts the same computational, storage, and network demands. “Without actually migrating the app, I could just try out that representation of what it does,” says Yang.
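The article doesn’t spell out how the dummy works internally, but the general shape of such a workload replayer is easy to sketch: record the resource demands of the real application, then replay equivalent but meaningless activity on each candidate cloud and time it. The version below is a hypothetical illustration; the trace format is an assumption, and a local temporary directory stands in for a cloud storage service.

```python
# Hypothetical sketch of a workload "dummy": it does no useful work, but it
# replays CPU and storage activity of the same volume as the real application.
import hashlib
import os
import tempfile
import time


def replay(trace):
    """trace: list of (kind, amount) pairs captured from the real app."""
    scratch = tempfile.mkdtemp()
    timings = []
    for kind, amount in trace:
        start = time.perf_counter()
        if kind == "cpu":            # burn roughly `amount` hash operations
            digest = b"seed"
            for _ in range(amount):
                digest = hashlib.sha256(digest).digest()
        elif kind == "storage":      # write and read back `amount` bytes
            path = os.path.join(scratch, "blob")
            with open(path, "wb") as f:
                f.write(os.urandom(amount))
            with open(path, "rb") as f:
                f.read()
        timings.append((kind, time.perf_counter() - start))
    return timings


# Example trace: one computation burst followed by two storage round-trips.
print(replay([("cpu", 50_000), ("storage", 1_000_000), ("storage", 1_000_000)]))
```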
