The Million-Core Problem

Stanford researchers break a supercomputing barrier.

David Zax

January 30, 2013

A team of Stanford researchers has broken a record in supercomputing, using a million cores to model a complex fluid dynamics problem. The computer is a newly installed Sequoia IBM Blue Gene/Q system at Lawrence Livermore National Laboratory. Sequoia has 1,572,864 processor cores, reports Andrew Myers of Stanford Engineering, and 1.6 petabytes of memory.

What do you need a million cores for? Apparently, for the complex problem of modeling supersonic jet noise. Joseph Nichols, a researcher at Stanford Engineering’s Center for Turbulence Research, led the effort. Naturally, engineers would like to design engines that don’t make as much noise–it’d be a boon for airport workers, airports, and the communities around them. Simulating new engine designs is one way to do this–but the simulations are extraordinarily complex, and running them quickly requires a very large number of processors.

“The waves propagating throughout the simulation require a carefully orchestrated balance between computation, memory and communication,” is how Myers puts it. The complicated math problem Nichols wanted to run was divided up into smaller parts for the million-plus cores to work on. The staff in charge of the supercomputer wasn’t even sure if “full-system scaling” of the type Nichols wanted to try would work properly–but it passed with flying colors. (For more on a different though equally exciting frontier of supercomputing, see “Computing with Light.”)
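To make the idea of dividing a problem into smaller parts concrete, here is a minimal sketch of splitting a one-dimensional grid of cells into contiguous, near-equal ranges, one per worker. It is an illustration only, not the method Nichols's team used: the function name `partition` and the small numbers in the usage line are hypothetical, and a real fluid dynamics code would decompose a three-dimensional mesh and exchange boundary data between neighboring parts.

```python
def partition(n_cells, n_workers):
    """Split n_cells grid cells into n_workers contiguous, near-equal ranges.

    Returns a list of (start, end) pairs with end exclusive. Hypothetical
    helper for illustration; not taken from the Stanford simulation code.
    """
    base, extra = divmod(n_cells, n_workers)
    ranges = []
    start = 0
    for rank in range(n_workers):
        # The first `extra` workers each take one additional cell,
        # so no worker holds more than one cell above any other.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Ten cells shared among three workers.
    print(partition(10, 3))
```

Keeping the ranges nearly equal matters because, at full-system scale, the whole run tends to move at the pace of the most heavily loaded core.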

Wired’s Klint Finley helpfully explains what makes Sequoia different: its cores are networked in a new way. It gets a little technical: “Each processor is directly connected to ten other processors, and can connect, with lower latency, to processors further away. But some of those processors also have an 11th connection, which taps into a central input/output channel for the entire system. These special processors collect signals from the processors and write the results to disk.”
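The "ten other processors" figure is what you would expect from a five-dimensional torus, which is how the Blue Gene/Q interconnect is commonly described, though the quoted passage does not say so. In such a grid each node has two links per dimension, and the edges wrap around. The sketch below counts those links; the function name `torus_neighbors` and the grid dimensions are hypothetical, chosen only for illustration, and are not Sequoia's actual layout.

```python
def torus_neighbors(coord, dims):
    """Return the direct neighbors of a node in a wraparound (torus) grid.

    coord: tuple of the node's coordinates, one per dimension.
    dims:  tuple with the size of each dimension.
    Hypothetical helper for illustration only.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            neighbor = list(coord)
            # The modulo makes the grid wrap around at its edges.
            neighbor[axis] = (coord[axis] + step) % size
            neighbors.append(tuple(neighbor))
    return neighbors


if __name__ == "__main__":
    dims = (4, 4, 4, 4, 3)  # assumed sizes, not Sequoia's real dimensions
    links = torus_neighbors((0, 0, 0, 0, 0), dims)
    print(len(links))  # two links in each of five dimensions
```

A node farther away is reached by hopping through intermediate nodes along these links, which is why the shape of the network matters for how well a job scales.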

Finley points out that while open source platforms like Hadoop can help bring distributed computing of a sort to the masses, there’s no replacement for dedicated hardware of the kind Sequoia provides.

Sequoia was once the fastest supercomputer; it was recently surpassed, though, by an Oak Ridge computer called Titan, a Cray XK7. SingularityHub points out that speed and number of cores are important, but not everything: we need to appreciate the value of efficiency gains, too. Sequoia was recently ranked 29th by Green500 in the efficiency department.
