Breaking the Botnet Code

Software that deciphers botnet communications could help infiltrate criminals’ networks.

Robert Lemos

November 11, 2009

Networks of compromised computers controlled by a central server, better known as botnets, are a Swiss Army knife of tools for online criminals. Hackers can use these co-opted systems to churn out spam, host malicious code, hide their tracks on the Internet, or flood a corporate network to cut off its access to the Web.

Whenever a new botnet appears, researchers race to reverse engineer the software it installs on a victim’s machine, and to decode the way each bot communicates with the controlling server. Because these communications are often encrypted, such analyses can take weeks or months. Now researchers from the University of California at Berkeley and Carnegie Mellon University have created a way to automatically reverse engineer the communications between compromised computers and their controlling servers.

In a paper to be presented this week at the Association for Computing Machinery’s Conference on Computer and Communications Security, the researchers show how automatic reverse engineering can decipher the structure and purpose of the communications between a command-and-control server and its bots.

“The communications protocol of the botnet is the core of the botnet,” says Juan Caballero, a PhD student affiliated with both the University of California at Berkeley and Carnegie Mellon University, and lead author of the paper. “That is how the attacker sends commands to the botnet.”

When researchers have previously tried to automatically analyze botnet communication protocols, they focused on deciphering the commands received by the client. Yet Caballero, together with UC Berkeley assistant professor Dawn Song and two other colleagues, has developed a technique that translates both the commands received by a client and the responses it sends.

The researchers ran the botnet code on a virtual machine and analyzed the movement of information to and from a computer’s registers (memory components within a machine’s processor) before it was encrypted. Watching for changes in the memory registers, a technique the researchers call “buffer deconstruction,” allowed them to derive the structure of the botnet communications and infer the function of the various components of each command.
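The article does not describe Dispatcher's internals, so the following is only a toy sketch of the general idea: if an analyst can log which part of the program wrote each byte of an outgoing message before encryption, the boundaries between those writes suggest where one field ends and the next begins. The `Write` record, the `deconstruct` function, and the source labels are all hypothetical names chosen for illustration, not part of the researchers' tool.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Write:
    """One logged write into the outgoing message buffer (hypothetical trace record)."""
    offset: int   # position in the buffer where the write starts
    data: bytes   # bytes that were written
    source: str   # assumed label for where the bytes came from


def deconstruct(buffer_len: int, trace: List[Write]) -> List[Tuple[int, int, Optional[str]]]:
    """Split a buffer into (offset, length, source) fields based on who wrote each byte."""
    # For every byte, remember the source of the last write that touched it.
    origin: List[Optional[str]] = [None] * buffer_len
    for w in trace:
        for i in range(len(w.data)):
            origin[w.offset + i] = w.source

    # Merge neighbouring bytes that share a source into a single field.
    fields = []
    start = 0
    for i in range(1, buffer_len + 1):
        if i == buffer_len or origin[i] != origin[start]:
            fields.append((start, i - start, origin[start]))
            start = i
    return fields


# Made-up trace for a 12-byte message built from three separate writes.
example_trace = [
    Write(0, b"\x00\x0c", "length counter"),
    Write(2, b"\x01", "constant"),
    Write(3, b"hostname1", "system call result"),
]
example_fields = deconstruct(12, example_trace)
for field in example_fields:
    print(field)
```

In this sketch the source label doubles as a hint about what a field means: bytes copied from a system call result are likely data about the infected machine, while a constant is more likely a message type or flag.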

“This is relevant for malware, because we typically do not have the executable for the command-and-control server of a botnet,” said Paolo Milani, a postdoctoral researcher at the Secure System Lab at the Vienna Institute of Technology and author of an earlier paper on automated protocol analysis. “So with previous techniques, we would not be able to automatically reverse engineer the client side of the protocol.”

The researchers built the resulting technique into a tool, called Dispatcher, to analyze botnet network communications and even inject new information into the communications stream. The researchers tested the approach on a complex botnet known as MegaD, which made headlines in early 2008 when security firms noticed it was responsible for nearly a third of spam traffic worldwide.

The researchers analyzed 15 messages they’d collected by monitoring a MegaD bot: seven commands sent from the control servers and eight responses from the bot. The Dispatcher tool analyzed the bot as it ran on the virtual machine and automatically detected the point at which the program decrypted commands but had not yet encrypted its responses.

Network administrators can also use the Dispatcher tool to infiltrate the botnet. MegaD clients typically will check to see if they can send e-mail, so as to become a useful cog in a spamming campaign. Because the researchers blocked all outgoing mail traffic, however, the client would normally have sent a message to the controlling server saying that its mail test failed. But the researchers modified the message en route, responding instead with the code for a successful spamming test.
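Once a message's structure is known, changing one field in it is simple. The sketch below illustrates that step under invented assumptions: the article does not give MegaD's actual message layout, so the three-byte header (a two-byte message type followed by a one-byte status), the status values, and the `rewrite_status` function are all hypothetical.

```python
import struct

# Hypothetical layout: 2-byte big-endian message type, then a 1-byte status code.
HEADER = struct.Struct(">HB")

# Assumed status values, invented for this example.
TEST_FAILED = 0x00
TEST_OK = 0x01


def rewrite_status(message: bytes, new_status: int) -> bytes:
    """Return a copy of the message with only the status field replaced."""
    msg_type, _old_status = HEADER.unpack_from(message)
    # Rebuild the header with the new status and keep the rest of the message untouched.
    return HEADER.pack(msg_type, new_status) + message[HEADER.size:]


# A made-up "mail test failed" report, rewritten to claim success.
original = HEADER.pack(0x0022, TEST_FAILED) + b"payload"
rewritten = rewrite_status(original, TEST_OK)
```

The point of the sketch is that the edit happens on the plaintext form of the message, which is why knowing where the bot encrypts its responses matters: a change made after encryption would simply corrupt the message.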

“Normally, it would have sent a message saying that it can’t spam,” UC Berkeley’s Caballero says. “We [instead] actually got the spam template, so we could see what sort of spam it would send out.”

Tools such as Dispatcher could expand what is currently a small number of researchers who regularly reverse engineer botnets, says Joe Stewart, senior security researcher for SecureWorks, a network security firm. “It would solve a problem that the world has–having enough people to analyze botnets,” he says. “There are only so many people who can do reverse engineering on botnets. You have a cadre of enthusiasts who could use this to help them.”

Stewart adds, however, that experienced researchers don’t yet need such automated tools for analyzing most malware. While more complicated botnets can take weeks to reverse engineer, run-of-the-mill malware encountered by most companies and organizations is no problem at all. More than 90 percent of all botnets use easy-to-break encryption to protect their communications, making manual techniques relatively easy and fast.

“Not every [bot master] needs the MegaD-type encryption,” Stewart says. “I just don’t think it is worth their time, not with the effect we are having on them now, which is minimal.”

Yet botnets will continue to evolve, says UC Berkeley’s Song. “Botnet programs are becoming more complicated,” she says. “They are using various obfuscation techniques and so on. So maybe manual analysis can work for now, but in the future, we will need better tools.”
