
New Benchmark Ranks Supercomputers More Thoughtfully

The Graph500 test measures “deep analysis” rather than the ability to perform basic arithmetic.
December 14, 2010

There are dozens of ways to usefully measure the performance of the world’s fastest supercomputers – everything from the speed of a system’s internal communications network and its ability to randomly access memory, to the traditional measure of a high performance system’s speed: its ranking on the Linpack benchmark, which is used to “officially” rate systems on the international Top500 list.

In the Graph500 benchmark, supercomputers race each other to process every “edge” (connection) in an artificially generated graph such as this one.
Credit: Jeremiah Willcock, Indiana University

Now a new ranking, proposed by a steering committee including 30 of the world’s top high performance computing (HPC) experts, has debuted: the Graph500 list, which measures a supercomputer’s ability to chew through “graph” data.

Graph data turns out to be the lingua franca of countless areas, from biomedicine (drug discovery, protein interactions, gene networks, public health) to homeland security, fraud detection, and the social graphs that represent users on networks like Facebook. All these fields, and countless others, could benefit from faster parsing of graphs.

The first-ever Graph500 list was unveiled in November at Supercomputing 2010 in New Orleans. The list is so new that results were submitted for only nine systems, with the Department of Energy’s IBM Blue Gene/P-powered Intrepid system, using 8,192 of its 40,960 nodes, coming out on top.

The Intrepid turned in an impressive 6.6 giga-edges per second (GE/s), meaning it leapt from one vertex (or node) of an artificially generated graph to another, along the connections between them (known as edges), 6.6 billion times per second.

This level of speed is required for “traversing”, or stepping through, very large graphs, such as all the members (or vertices) of Facebook and all the connections (or edges) between them. Imagine, for example, that you want to know the smallest number of connections separating any two members of Facebook. Depending on the sophistication of the search algorithm, a computer might have to explore many potential paths through the social graph to determine which members make up the chain of degrees of separation between the two.
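As a rough illustration (not the benchmark’s actual code), here is a minimal Python sketch of that kind of search: a breadth-first traversal over a small, hypothetical “friends” graph that finds the shortest chain of connections between two members and counts every edge it steps across. Dividing that count by the elapsed time gives an edges-per-second figure of the sort Graph500 reports, in billions.

```python
from collections import deque
import time

# Hypothetical toy social graph: each key is a member, each list holds
# that member's direct connections (the graph's edges).
friends = {
    "alice": ["bob", "carol"],
    "bob":   ["alice", "dave"],
    "carol": ["alice", "dave", "erin"],
    "dave":  ["bob", "carol", "frank"],
    "erin":  ["carol"],
    "frank": ["dave"],
}

def shortest_chain(graph, start, goal):
    """Breadth-first search: return the shortest path and the number of edges traversed."""
    edges_traversed = 0
    visited = {start}
    queue = deque([[start]])           # each entry is a path beginning at `start`
    while queue:
        path = queue.popleft()
        person = path[-1]
        for neighbor in graph[person]:
            edges_traversed += 1       # count every edge we step across
            if neighbor == goal:
                return path + [neighbor], edges_traversed
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None, edges_traversed

t0 = time.perf_counter()
chain, edges = shortest_chain(friends, "alice", "frank")
elapsed = time.perf_counter() - t0

print(" -> ".join(chain))                    # alice -> bob -> dave -> frank
print(edges, "edges traversed")
print(edges / elapsed, "edges per second")   # Graph500 ranks machines on this rate
```

On a toy graph like this the search finishes instantly; the Graph500 runs the same kind of traversal over synthetic graphs with billions of edges, which is where the giga-edges-per-second figure comes from.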

Interestingly, the third and fourth fastest systems on the Graph500 list – two Cray XMT supercomputers housed at Pacific Northwest National Laboratory and Sandia National Laboratories – managed their high ranking despite having relatively few processors: 128 apiece. Averaging 1.2 giga-edges per second apiece, these Cray machines were more than ten times faster, measured per node, than the IBM Blue Gene/P-powered Intrepid system.
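The back-of-the-envelope arithmetic behind that per-node comparison, using the figures reported above and treating each XMT processor as one node, works out roughly as follows:

```python
# Per-node comparison using the figures quoted above
# (each Cray XMT processor is counted as one node here).
intrepid_rate, intrepid_nodes = 6.6e9, 8192   # Intrepid: 6.6 GE/s on 8,192 nodes
xmt_rate, xmt_procs = 1.2e9, 128              # Cray XMT: 1.2 GE/s on 128 processors

print(intrepid_rate / intrepid_nodes)   # ~0.8 million edges/s per node
print(xmt_rate / xmt_procs)             # ~9.4 million edges/s per processor
print((xmt_rate / xmt_procs) / (intrepid_rate / intrepid_nodes))   # ~11.6x
```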

That should be no surprise, when you consider that the Cray XMT is explicitly designed to tackle graph data. This specialization extends to the machine’s development environment.

“John Thale, who has the number three system [on the Graph500 list], said it took him an hour or two to write the code [required to run the Graph500 benchmark], whereas on a conventional system it takes days,” says Shoaib Mufti, director of knowledge management at Cray.

It’s this kind of specialization that will be required for the “data-intensive computing” that will characterize how supercomputers are used in the future. As Richard Murphy of Sandia National Laboratories, who led the founding of the Graph500 list, argued in a talk delivered in September, no amount of acceleration of conventional systems, including the increasingly popular GPGPU supercomputers, can conquer the applications to which high performance systems will be put in the coming years: “deep analysis” of data designed to extract complicated relationships between data points.

This kind of computing, and not the kind measured by the traditional Linpack benchmark, will determine the true power of systems applied to graph data, Murphy argued in his talk, the slides of which included this interpretation of the bottom line for high performance systems:

This is a survival of the fittest question… the person with more complete data or better algorithms will win by:

- Being more profitable

- Beating you to the Nobel Prize

- Etc.

Follow Mims on Twitter or contact him via email.
