Researchers at the University of Michigan have developed software that hunts for flaws in chips and proposes the best way to fix them. Their approach tackles a growing problem for chip makers such as AMD and Intel. As transistors shrink and chips acquire more-complex designs, hardware bugs are becoming more prevalent. Currently, it can take up to a year to debug prototype chips and get them ready for mass production. The new software could shorten the time it takes to get a chip to market, cut costs by reducing the number of prototypes and testing cycles, and ultimately yield chips with fewer flaws.
“This is still an unsolved problem,” says Rob Rutenbar, a professor of electrical and computer engineering at Carnegie Mellon University, who adds that there is very little scientific literature on debugging silicon. “Intel might have some sophisticated technology, but they’re not talking about it. For all we know, people are doing it by hand,” Rutenbar says. “The sense that I get is that it’s not very well automated.”
Debugging by hand leaves more room for error. “Pretty much all chips, including microprocessors, are buggy,” says Igor Markov, a professor of electrical engineering and computer science at the University of Michigan. Intel’s website, for instance, lists about 130 known hardware bugs on commercial laptops. Most can be fixed with software downloads, but about 20 of them can’t be, Markov says, and they leave machines vulnerable to viruses.
Markov and his colleague Valeria Bertacco, professor of electrical engineering and computer science at Michigan, developed software that tackles the bug-fixing problem after the first round of prototypes has come back to the chip maker. “When you have a first version of a chip, it’s not ready to give to the consumer,” says Bertacco. Engineers need to try to run operating systems and software on it to see if it works, and this process can take anywhere from a couple of hours to a week, depending on the number of flaws in the chips.
“It’s very hard to figure out what is wrong,” Bertacco says. And once an engineer has identified a bug, which can be anything from wires spaced too closely together to misplaced transistors, it’s not always clear what the best fix will be. Often, engineers repair one problem only to discover in the next round of prototypes that their solutions have inadvertently added other flaws. Prototypes can take months to build, and they are expensive: changing the designs on the masks used to pattern layers of transistors and wires on the chips costs millions of dollars.
Currently, when a prototype comes back to a chip maker, engineers hook it up to probes that send electrical signals through it and record the output, explains Bertacco. Different signals go to different parts of the chip, and by trying out thousands of signals, engineers can usually locate a problem. Then they propose a series of possible solutions. Sometimes they simply need to remove a connection between two wires in one of the upper layers of the chip. This can be done using equipment readily available in the lab, and the chip can quickly be retested. Other times, fixes are needed at lower layers within the chip, where the transistors make up logic gates. These transistors can’t be so easily adjusted and retested.
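The probe-and-compare routine described above can be illustrated with a toy model. The sketch below is not the Michigan software, whose internals the article does not describe. It is a minimal illustration that assumes a made-up four-gate circuit (`NETLIST`), a single faulty gate, and a hypothetical `observe` function standing in for the probed prototype. It applies every input pattern, notes which outputs disagree with the intended design, and keeps only the gates that feed all of the failing outputs.

```python
from itertools import product

# Hypothetical toy circuit: gate name -> (operation, input signal names).
# Entries are listed so that each gate appears after the signals it reads.
NETLIST = {
    "n1": ("AND", ("a", "b")),
    "n2": ("OR", ("b", "c")),
    "out1": ("XOR", ("n1", "c")),
    "out2": ("AND", ("n2", "a")),
}
PRIMARY_INPUTS = ("a", "b", "c")
OUTPUTS = ("out1", "out2")
OPS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
}


def simulate(inputs, faulty_gate=None):
    """Evaluate the circuit; optionally invert one gate's output to mimic a bug."""
    values = dict(inputs)
    for name, (op, (x, y)) in NETLIST.items():
        result = OPS[op](values[x], values[y])
        if name == faulty_gate:
            result ^= 1  # assumed fault model: the gate's output is flipped
        values[name] = result
    return {out: values[out] for out in OUTPUTS}


def fan_in(signal):
    """Return the set of gates that can influence the given signal."""
    if signal not in NETLIST:  # primary inputs have no driving gate
        return set()
    gates = {signal}
    for source in NETLIST[signal][1]:
        gates |= fan_in(source)
    return gates


def localize(observe):
    """Compare observed outputs with the intended design and list suspect gates."""
    failing = set()
    for bits in product((0, 1), repeat=len(PRIMARY_INPUTS)):
        vector = dict(zip(PRIMARY_INPUTS, bits))
        expected = simulate(vector)  # what the design should produce
        observed = observe(vector)   # what the probed prototype produces
        failing |= {out for out in OUTPUTS if expected[out] != observed[out]}
    if not failing:
        return set()
    # Under the single-fault assumption, the bug must feed every failing output.
    return set.intersection(*(fan_in(out) for out in failing))


def buggy_chip(vector):
    """Stand-in for a prototype whose gate n2 misbehaves."""
    return simulate(vector, faulty_gate="n2")


suspects = localize(buggy_chip)
print(sorted(suspects))
```

In this toy case only `out2` ever disagrees with the reference, so the suspects narrow to `n2` and `out2`, the two gates that drive it. A real chip has millions of gates and far fewer observable signals, which is part of why the search is so laborious by hand.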