Today’s top-of-the-line computers have dual-core processors: two computing units that can handle separate tasks at the same time. And by next year, major chip makers Intel and AMD will have rolled out quad-core systems. Although multiple cores are theoretically faster than a single core, writing software that takes advantage of many processors, a task called parallel programming, is extremely difficult.
Recent research from MIT, however, could make parallel programming easier, ultimately helping to keep personal-computing performance on track. The researchers are proposing a new computing framework that combines specialized software instructions and modifications to multi-core hardware that could allow programmers to write software without having to deal with some tedious parallel-programming details.
Historically, writing software for multi-core systems has been the job of experts in the supercomputing world. But with the coming age of personal supercomputers, average programmers also need to be able to write software with multiple cores in mind.
“That’s a scary thing,” says Krste Asanovic, professor of electrical engineering and computer science at MIT, “because most have never done that, and it’s quite difficult to do.” Asanovic and his colleagues are tackling one of the main challenges that programmers face when they try to write software that will run efficiently on multi-core systems: coordinating multiple tasks that run on separate cores in a way that doesn’t cause the system to crash.
When an application such as Microsoft Outlook or a video player is parallelized, certain tasks are divvied up among the processors. But often these separate tasks need to dip into a shared memory cache to access data. When one task is accessing a part of memory that another task also needs, and proper safeguards aren’t in place, the system can crash. It’s like a couple with a joint checking account and limited funds: if both write checks at the same time, they can inadvertently overdraw the account.
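The overdrawn-account analogy can be sketched in a few lines of Python. This is a minimal illustration, not code from the MIT system; the function names are invented for the example. Two withdrawals each read the balance before either writes its result back, so one update is lost:

```python
# Two tasks share one account balance. Each checks the funds and then
# withdraws -- but without synchronization their steps can interleave.
balance = 100

def read_balance():
    return balance

def write_balance(new_value):
    global balance
    balance = new_value

# An interleaving a real scheduler could produce:
seen_by_a = read_balance()     # task A sees $100
seen_by_b = read_balance()     # task B also sees $100 -- A hasn't written yet
write_balance(seen_by_a - 80)  # A withdraws $80; balance is now $20
write_balance(seen_by_b - 80)  # B withdraws $80 based on the stale $100
print(balance)                 # 20 -- yet $160 in checks went out
```

Both tasks believed the account held $100, so together they withdrew $160 from an account that shows $20 left: exactly the overdraft the analogy describes.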
Standard parallel programming requires a programmer to anticipate these simultaneous activities and make sure that once a certain activity begins to access memory, it “locks” out other activities so they wait until the transaction is completed.
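In Python, that locking discipline looks roughly like this (again an illustrative sketch, using the standard `threading` module rather than anything specific to the MIT work). The lock makes the check-then-withdraw sequence atomic, so the overdraft above cannot happen:

```python
import threading

balance = 100
lock = threading.Lock()

def withdraw(amount):
    global balance
    # The lock "locks out" other activities: no other thread can read or
    # change the balance between our check and our write.
    with lock:
        if balance >= amount:
            balance -= amount
            return True
        return False  # insufficient funds: the second withdrawal is refused

threads = [threading.Thread(target=withdraw, args=(80,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)  # 20 -- exactly one $80 withdrawal succeeded
```

Whichever thread acquires the lock first withdraws $80; the other finds only $20 remaining and is refused, so the account is never overdrawn.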
When implemented correctly, the locks speed up parallel systems, but putting them into practice is complicated, says Jim Larus, research area manager at Microsoft. For instance, he explains, two different applications could acquire locks at the same time, which forces them to wait for each other. Without some third party coming in to break up the “deadlock,” Larus says, the applications would stay frozen.
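The deadlock Larus describes, and one common way around it, can be sketched as follows. This is a generic illustration, not Microsoft's or MIT's code: if thread 1 takes lock A then B while thread 2 takes B then A, each can grab its first lock and then wait forever on the other's. A standard fix is a global lock order, shown here:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
log = []

# Deadlock-prone pattern (not run here): thread 1 acquires lock_a then
# lock_b, thread 2 acquires lock_b then lock_a. If each grabs its first
# lock before the other's second, both wait forever -- a "deadlock."
#
# One standard fix: every thread acquires the locks in the same order,
# so a circular wait is impossible.
def task(name):
    with lock_a:      # all threads take lock_a first...
        with lock_b:  # ...then lock_b
            log.append(name)

threads = [threading.Thread(target=task, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(log))  # ['t1', 't2'] -- both tasks completed, no deadlock
```

Enforcing a consistent order works in a small program, but doing it across a large application with hundreds of locks is precisely the kind of discipline that makes lock-based programming so error-prone.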
The MIT researchers get around this by using an approach called transactional memory, a research area that has exploded in the past five years, says Asanovic. Transactional memory coordinates memory accesses automatically, so programmers don’t have to write that coordination into their programs. It actually allows numerous transactions to share the same memory at the same time. When a transaction completes, the system verifies that no other transaction has changed the memory in a way that would invalidate its result. If one has, the transaction is re-executed until it succeeds.
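The check-and-retry idea can be modeled with a toy software version-counter. This is a simplified sketch of the general transactional-memory pattern, not the MIT design, and the class and method names are invented for the example. A transaction reads a value along with a version number, computes its result, and commits only if the version is unchanged; otherwise it re-executes:

```python
import threading

class TxCell:
    """A toy transactional memory cell: commits succeed only if no other
    transaction has changed the cell since it was read."""
    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # guards only the brief commit step

    def read(self):
        return self.value, self.version

    def try_commit(self, seen_version, new_value):
        with self._lock:
            if self.version != seen_version:
                return False  # someone else committed first: abort
            self.value = new_value
            self.version += 1
            return True

def add(cell, amount):
    # Optimistic retry loop: re-execute the transaction until it commits.
    while True:
        value, version = cell.read()
        if cell.try_commit(version, value + amount):
            return

cell = TxCell(0)
threads = [threading.Thread(target=add, args=(cell, 1)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(cell.value)  # 10 -- every increment committed exactly once
```

The programmer writes only the transaction body; the retry machinery handles conflicts, which is the convenience transactional memory promises. (Hardware proposals do the conflict detection in the cache rather than with version counters, but the commit-or-retry logic is the same.)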
While transactional memory works in some cases, it’s still not perfect, explains Asanovic. Most of the time the transactions are small enough to fit in the fixed amount of memory the hardware sets aside, which handles them quickly. But, he says, once in a while a transaction requires more memory than that fixed amount, and when this happens, the system crashes. Asanovic says that by adding a small backup memory cache to the hardware, and software that recognizes when transactions are overflowing, the capacity of transactional memory can be increased, avoiding those failures.
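The overflow idea can be modeled crudely in code. This is a speculative sketch of the concept only, with invented names and sizes, not the MIT hardware design: a transaction buffers its writes in a small fixed-size store, and when that fills, further writes spill into an unbounded software-managed backup instead of failing:

```python
# Toy model of transaction overflow: writes go to a small fixed-size
# buffer (standing in for the hardware cache) until it fills, then
# spill to a backup store instead of crashing the transaction.
FIXED_SLOTS = 4  # illustrative capacity, not a real hardware figure

class TxBuffer:
    def __init__(self):
        self.fast = {}    # models the fixed-size hardware cache
        self.backup = {}  # models the added backup memory

    def write(self, addr, value):
        if addr in self.fast or len(self.fast) < FIXED_SLOTS:
            self.fast[addr] = value
        else:
            self.backup[addr] = value  # overflow: spill, don't fail

    def committed_view(self):
        merged = dict(self.fast)
        merged.update(self.backup)
        return merged

buf = TxBuffer()
for addr in range(6):       # 6 writes overflow the 4 fast slots
    buf.write(addr, addr * 10)
print(len(buf.fast), len(buf.backup))  # 4 2
```

The common case (small transactions) stays in the fast path; only the rare oversized transaction pays the cost of the backup store, which matches the article's point that overflow is infrequent.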
The method that the MIT researchers use relies on a combination of software and hardware to make transactional memory better, says Microsoft’s Larus, and there have been numerous designs that rely on software or hardware to varying degrees. “It’s not clear yet where the right line is” between using hardware and software to solve the problem, he says, but the researchers are tackling important unresolved issues in programming multi-core systems.
Microsoft, AMD, Intel, and universities such as MIT and Stanford, among others, are all invested in making multi-core systems easier to program. In addition to improving transactional memory, researchers are exploring better ways of debugging parallel programs and also creating libraries of ready-made parallel operations so that programmers can plug chunks of code into software without having to work out the kinks each time.
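Such libraries of ready-made parallel operations exist today; Python's standard `concurrent.futures` module is one modern example (chosen here for illustration, it is not one of the research libraries the article refers to). The programmer supplies only the per-item function; the library handles splitting the work, managing threads, and collecting results in order:

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

# A ready-made parallel operation: map() distributes the calls across a
# pool of worker threads and returns the results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, range(8)))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The kinks, thread creation, work distribution, and synchronization, are worked out once inside the library rather than by every programmer.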
Today’s dual-core systems aren’t as affected by the lack of truly parallel programs as the coming quad-core systems will be, says Asanovic. For the most part, operating systems such as Windows and Mac OS X can effectively split up applications on a dual-core system: a virus scanner runs unobtrusively in the background on one core, while applications such as Microsoft Word or Firefox run on the other, without their speed being hampered.
But when it comes to 4, 8, or 16 cores, the applications themselves need to be modified to garner more performance. Asanovic says transactional memory won’t be a silver bullet that makes it easier to program these systems, but he expects it to be a component of the future parallel-computing model. “It’s one mechanism that appears to be useful,” he says.