Software That Fixes Itself

A new tool aims to fix misbehaving programs without shutting them down.

Erica Naone

October 29, 2009

Martin Rinard, a professor of computer science at MIT, is unabashed about the ultimate goal of his group’s research: “delivering an immortal, invulnerable program.” In work presented this month at the ACM Symposium on Operating Systems Principles in Big Sky, MT, MIT researchers led by Rinard and Michael Ernst, who is now an associate professor at the University of Washington, described software that can find and fix certain types of software bugs within a matter of minutes.

When a potentially harmful vulnerability is discovered in a piece of software, it takes nearly a month on average for human engineers to come up with a fix and to push the fix out to affected systems, according to a report issued by security company Symantec in 2006. The researchers, who collaborated with a startup called Determina on the work, hope that the new software, called ClearView, will speed this process up, making software significantly more resilient against failure or attack.

ClearView works without assistance from humans and without access to a program’s underlying source code (an often proprietary set of instructions that defines how a piece of software will behave). Instead, the system monitors the behavior of a binary: the form the program takes in order to execute instructions on a computer’s hardware.

By observing a program’s normal behavior and assigning a set of rules, ClearView detects certain types of errors, particularly those caused when an attacker injects malicious input into a program. When something goes wrong, ClearView detects the anomaly and identifies the rules that have been violated. It then comes up with several potential patches designed to force the software to follow the violated rules. (The patches are applied directly to the binary, bypassing the source code.) ClearView analyzes these possibilities to decide which are most likely to work, then installs the top candidates and tests their effectiveness. If additional rules are violated, or if a patch causes the system to crash, ClearView rejects it and tries another.
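The cycle described above (learn rules from normal runs, detect a violation, generate candidate patches, and keep only one that works) can be illustrated with a toy sketch. This is not ClearView's implementation, which operates on binaries. The names `toy_program`, `RangeRule`, `Patch`, and `choose_patch`, the buffer size, and the two candidate repairs are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

BUFFER_SIZE = 8  # hypothetical capacity of the toy program's buffer


def toy_program(length: int) -> str:
    """Stand-in for the monitored program; fails when the input overflows."""
    if length < 0 or length > BUFFER_SIZE:
        raise RuntimeError("buffer overflow")
    return "x" * length


@dataclass
class RangeRule:
    """A learned rule: the observed value stays within [low, high]."""
    low: int
    high: int

    def holds(self, value: int) -> bool:
        return self.low <= value <= self.high


def learn_rule(normal_values: List[int]) -> RangeRule:
    """Infer a rule from values seen during normal runs."""
    return RangeRule(min(normal_values), max(normal_values))


@dataclass
class Patch:
    """A candidate repair: a name plus a function that rewrites the input."""
    name: str
    apply: Callable[[int], int]


def candidate_patches(rule: RangeRule) -> List[Patch]:
    """Propose several ways of forcing the value back toward the rule."""
    return [
        Patch("halve", lambda v: v // 2),
        Patch("clamp", lambda v: max(rule.low, min(v, rule.high))),
    ]


def choose_patch(rule: RangeRule, bad_input: int) -> Optional[Patch]:
    """Try candidates in order; reject any that still break a rule or crash."""
    for patch in candidate_patches(rule):
        repaired = patch.apply(bad_input)
        if not rule.holds(repaired):
            continue  # the rule is still violated, so reject this patch
        try:
            toy_program(repaired)
        except RuntimeError:
            continue  # the patched run crashed, so reject this patch
        return patch
    return None


rule = learn_rule([3, 5, 8])    # values observed during normal behavior
patch = choose_patch(rule, 40)  # 40 violates the learned rule
```

In this sketch the "halve" candidate is rejected for the input 40 because the result still falls outside the learned range, and "clamp" is kept. A real system would also have to rank candidates and watch for side effects over many later runs, which this toy does not attempt.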

ClearView is particularly effective when installed on a group of machines running the same software. In that case, what ClearView learns from errors on one machine is used to fix all the others. Because it doesn’t require access to source code, Rinard says that ClearView could be used to fix programs without requiring the cooperation of the company that made the software, or to repair programs that are no longer being maintained. He hopes the system could extend the life of older versions of software, created by companies that have gone out of business, in addition to protecting current software.

To test the system, the researchers installed ClearView on a group of computers running Firefox and hired an independent team to attack the Web browser. The hostile team used 10 different attack methods, each of which involved injecting some malicious code into Firefox. ClearView successfully blocked all of the would-be attacks by detecting misbehavior and terminating the application before the attack could have its intended effect. The first time ClearView encounters an exploit, it closes the program and begins analyzing the binary, searching for a patch that could have stopped the error.

For seven of the attacking team’s approaches, ClearView created patches that corrected the underlying errors. In all cases, it discarded corrections that had negative side effects. On average, ClearView came up with a successful patch within about five minutes of its first exposure to an attack.

“What this research is leading us to believe is that software isn’t in itself inherently fragile and brittle because of errors,” says Rinard. “It’s fragile and brittle because people are afraid to let the software continue if they think there’s something wrong with it.” Some software engineering approaches, such as “failure-oblivious computing” or “acceptable computing,” share this philosophy.

ClearView “is a really good starting point,” says Yuanyuan Zhou, a professor of computer science at the University of California, San Diego, who also researches software dependability. Zhou lauds the evaluation process the researchers used for the project but says she wants to see ClearView tested on a wider variety of applications.

“Keeping the system going at all costs does seem to have merit,” adds David Pearce, a senior lecturer in computer science at Victoria University in Wellington, New Zealand. He points out that ClearView is designed to apply patches whenever it detects that something has gone wrong. Some systems are designed to shut down when an error is detected, but if an attacker’s goal is sabotage, Pearce says, this approach plays right into their hands.

But ClearView’s approach could result in some hiccups for the user, Pearce adds. For example, if a Web browser had a bug that made it unable to handle URLs past a certain length, ClearView’s patch might protect the system by clipping off the ends of URLs that were too long. That would prevent the program from failing, but it would also prevent it from working fully. However, such issues probably wouldn’t be outright harmful. “It’s generally only hackers that attempt to exploit such loopholes,” says Pearce, “and they would be the ones who suffered.”
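Pearce's URL example can be pictured with a small sketch. It is purely illustrative: the length limit, the `fetch` function, and the `patched_fetch` wrapper are hypothetical, and a real patch of this kind would be applied to the binary rather than written in a scripting language.

```python
MAX_URL_LENGTH = 16  # hypothetical limit beyond which the toy code fails


def fetch(url: str) -> str:
    """Toy stand-in for browser code with a bug triggered by long URLs."""
    if len(url) > MAX_URL_LENGTH:
        raise RuntimeError("URL too long")
    return f"loaded {url}"


def patched_fetch(url: str) -> str:
    """Enforce the length rule by clipping the URL before it reaches fetch."""
    return fetch(url[:MAX_URL_LENGTH])


# The patched call no longer fails, but it loads a truncated address.
result = patched_fetch("http://a.example/very/long/path")
```

The patched version survives the oversized input, but the page it loads is not the one that was requested, which is the kind of hiccup Pearce describes.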
