It’s official: China’s next supercomputer, the petascale Dawning 6000, will be constructed exclusively with home-grown microprocessors. Weiwu Hu, chief architect of the Loongson (also known as “Godson”) family of CPUs at the Institute of Computing Technology (ICT), a division of the Chinese Academy of Sciences, also confirms that the supercomputer will run Linux. This is a sharp departure from China’s last supercomputer, the Dawning 5000a, which debuted at number 10 on the list of the world’s fastest supercomputers in 2008, was built with AMD chips, and ran Windows HPC Server.
The arrival of the Dawning 6000 will be an important landmark for the Loongson processor family, which to date has been used only in inexpensive, low-power netbooks and nettop PCs. When the Dawning 5000a was initially announced, it too was meant to be built with Loongson processors, but the Dawning Information Industry Company, which built the computer, eventually went with AMD chips, citing a lack of support for Windows and the ICT’s failure to deliver a sufficiently powerful chip in time.
The Dawning 6000 will be completed by mid-2011 at the latest, says Hu, and could be up and running as early as the end of 2010. It is the second time that a representative from the ICT has promised a supercomputer built entirely using Loongson processors.
Development of the Loongson family began in 2001 as a product of China’s 10th Five-Year Plan. All of the chips in the family are based on the MIPS instruction set, which was originally developed in the 1980s and has since fallen out of favor in desktop and server computers, although it is still used in many embedded devices. Currently, the Top 500 list is dominated by x86 chips, with non-x86 CPUs powering less than 15 percent of the high-performance systems on the list.
“This is a very high-performance MIPS architecture where, when it’s run in a cluster configuration, it becomes very powerful,” says Art Swift, vice president of marketing at Sunnyvale, CA-based MIPS Technologies, which developed the MIPS architecture.
A paper published in 2009 proposes using Loongson 3 chips in clusters of up to 16 cores to achieve extremely high performance. Tom Halfhill, analyst at Microprocessor Report, calculates that in this configuration, reaching the petaflop performance mark (one quadrillion floating-point operations per second) could require as few as 782 16-core chips.
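As a rough sanity check on that figure, the short Python sketch below works backward from the numbers quoted above to the throughput each chip and each core would have to sustain. It is only back-of-the-envelope arithmetic, not a reproduction of Halfhill’s calculation: it assumes performance scales perfectly with chip count and ignores interconnect and memory overhead, and the function and constant names are illustrative.

```python
# Back-of-the-envelope check of the chip count quoted in the article.
# Assumption: throughput scales linearly with the number of chips
# (no interconnect, memory, or scheduling overhead).

PETAFLOP = 1e15        # one quadrillion floating-point operations per second
CHIPS = 782            # chip count cited in the article
CORES_PER_CHIP = 16    # proposed 16-core Loongson 3 configuration


def implied_throughput(target_flops, chips, cores_per_chip):
    """Return the (per-chip, per-core) FLOPS needed to reach target_flops."""
    per_chip = target_flops / chips
    per_core = per_chip / cores_per_chip
    return per_chip, per_core


per_chip, per_core = implied_throughput(PETAFLOP, CHIPS, CORES_PER_CHIP)
total_cores = CHIPS * CORES_PER_CHIP

print(f"total cores: {total_cores}")                  # 12512
print(f"per chip: {per_chip / 1e12:.2f} teraflops")   # 1.28
print(f"per core: {per_core / 1e9:.1f} gigaflops")    # 79.9
```

Under those idealized assumptions, 782 chips amount to 12,512 cores, with each chip contributing roughly 1.28 teraflops. A real system would likely need more chips, since sustained performance typically falls short of the theoretical peak.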
Halfhill says the Loongson 3 is little different from the latest-generation chip, Loongson 2F, which is already available in consumer PCs. The main differences are that it includes hardware translation of x86 instructions (used in most of the microprocessors made by Intel and AMD), and it incorporates multiple cores (from four up to a proposed 16), each capable of processing instructions independently. Conspicuously absent from the Loongson 3 is multithreading, which allows a single core to execute multiple threads of instructions simultaneously. (Both Intel and Sun have already incorporated multithreading into some of their chips.)
Generations 2 and 3 of the Loongson use the same general-purpose core, but the Loongson 3 ties more cores together. A quad-core Loongson 3 chip is currently in prototype, and a final, 65-nanometer version of the chip was “taped out” in late December, meaning the final design of the chip is ready to be sent to the manufacturer, STMicroelectronics.