
Virus Counting Is a Numbers Game

The malware seen by antivirus companies will jump by at least half this year. But what does that mean?
July 23, 2009

Duck and cover.

That might be the first reaction to the latest statistics on the fast expansion of malicious software–“malware” for short–from antivirus firm McAfee. The company announced on Wednesday that it had found and analyzed more than 1.2 million unique malware programs in the first half of the year, compared to 500,000 in the first six months of 2008. In total, McAfee processed between 1.5 million and 1.6 million pieces of malware last year.

The data suggest that malware is on track to grow by 50 to 150 percent this year. Of course, the question remains: Just what is a unique piece of malware? Cybercriminals regularly compress their programs or obfuscate them in ways to elude detection by antivirus companies. In some cases, two victims can go to the same compromised website and get the same program packaged in two different ways, potentially leading an antivirus company to classify it as two different malicious programs.

“Literally, every time you hit the refresh button, you get a new build,” says David Marcus, director of security research and communications for McAfee.

McAfee defines unique malware as a program that requires a separate “driver,” which other companies might call a signature. There typically are three types of drivers: a generic driver that recognizes whole classes of viruses or Trojan horses, a heuristic driver that recognizes certain bad or exploitive behavior, and a specific driver for a single program–essentially a binary hash.

Each piece of malware has to go through McAfee’s analysis system, which manages workflow and automatically processes some 90 to 95 percent of the distinct files it receives from its installed base. Without the system, the company–and other antivirus firms–would be buried under an avalanche of code: McAfee’s analysis system, dubbed Artemis, must process tens of thousands of potentially malicious programs every day, of which nearly 7,000 are considered “new.”

Writing generic drivers to detect tens of thousands of the binary-distinct programs helps compress the sheer volume of the work, Marcus says.

“We are going to look at all the things we saw in a particular day, and ask, ‘How can we write a generic driver to detect all of those samples?’” he says. In addition, generic drivers help speed the firm’s virus-scanning engine.

In some ways, the expansion of malware means a greater workload for antivirus firms’ analysts. Most of the firms have opened overseas analysis centers to help them deal with the multiplying challenges of separating the malicious from the valid. In addition, better software tools–such as McAfee’s Artemis–help manage the workload more efficiently.

Yet the numbers lie as well. Efforts to make McAfee’s database more efficient–three generic drivers are far faster to apply than 100,000 individual signatures–require a great deal of work that is not reflected in the current tally. In fact, such work actually reduces the apparent increase in viruses and malicious code, because one generic driver replaces many specific ones. “The 1.2 million that you are seeing is not inclusive of the generic and heuristic number,” Marcus says.
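The three points above (a re-packed download changing its file hash on every refresh, the three kinds of driver, and a day's worth of binary-distinct files collapsing to a single generic driver) can be shown in a few dozen lines. The following is an added example, not McAfee's code or its real signature format; the packer header `PK1`, the rule names, the behaviour weights and the "malware" bodies are all constructed for illustration, and no real sample is involved.

```python
import hashlib
import random
import re
from typing import Optional

# Constructed toy "malware" bodies: plain text standing in for a program. Not real samples.
DOWNLOADER = b"CONNECT c2.example.invalid:443; DOWNLOAD /update.bin; EXEC"
# A second body that matches no generic rule but scores on behaviour alone.
WIPER = b"PERSIST run-key; EXEC; DELETE shadow-copies"
BENIGN = b"hello world; print report; exit"

MAGIC = b"PK1"  # hypothetical packer header, used only by this example


def repack(body: bytes, seed: int) -> bytes:
    """Simulate server-side repacking: every download gets a fresh XOR key and
    random junk, so the file hash changes although the program inside does not."""
    rnd = random.Random(seed)
    key = rnd.randrange(1, 256)
    junk = bytes(rnd.getrandbits(8) for _ in range(16))
    encoded = bytes(b ^ key for b in body)
    return MAGIC + bytes([key, len(junk)]) + junk + encoded


def unpack(data: bytes) -> bytes:
    """Undo the toy packer if present, otherwise return the file unchanged.
    Generic and heuristic drivers look at this unpacked view, not the raw bytes."""
    if not data.startswith(MAGIC):
        return data
    key, junk_len = data[3], data[4]
    return bytes(b ^ key for b in data[5 + junk_len:])


def specific_driver(data: bytes, known_hashes: set) -> Optional[str]:
    """Specific driver: the hash of one exact file. Any repack defeats it."""
    return "Specific.Match" if hashlib.sha256(data).hexdigest() in known_hashes else None


# Generic driver: one pattern for a whole family, applied after unpacking.
GENERIC_RULES = {"Downloader.Generic": re.compile(rb"CONNECT \S+; DOWNLOAD \S+; EXEC")}


def generic_driver(data: bytes) -> Optional[str]:
    body = unpack(data)
    for name, rule in GENERIC_RULES.items():
        if rule.search(body):
            return name
    return None


# Heuristic driver: weighted behaviours, flagged when the score crosses a threshold.
BEHAVIOURS = {b"PERSIST": 2, b"EXEC": 1, b"DELETE shadow-copies": 3, b"CONNECT": 1}


def heuristic_driver(data: bytes, threshold: int = 3) -> Optional[str]:
    body = unpack(data)
    score = sum(w for token, w in BEHAVIOURS.items() if token in body)
    return "Heuristic.Suspicious" if score >= threshold else None


def classify(data: bytes, known_hashes: set) -> list:
    """Run all three driver types and return every verdict that fires."""
    verdicts = [specific_driver(data, known_hashes), generic_driver(data), heuristic_driver(data)]
    return [v for v in verdicts if v]


def tally(files: list, known_hashes: set) -> dict:
    """Count a day's files the two ways the article contrasts: binary-distinct
    files (what a hash-based 'unique malware' count sees) versus the number of
    generic drivers needed to cover them."""
    unique_hashes = {hashlib.sha256(f).hexdigest() for f in files}
    families = {generic_driver(f) for f in files} - {None}
    covered = sum(1 for f in files if generic_driver(f) or heuristic_driver(f))
    return {
        "files": len(files),
        "unique_hashes": len(unique_hashes),
        "known_by_specific_hash": sum(1 for f in files if specific_driver(f, known_hashes)),
        "generic_drivers_needed": len(families),
        "detected_by_generic_or_heuristic": covered,
    }


if __name__ == "__main__":
    # Constructed "day": 100 refreshes of the same downloader, one packed wiper, one benign file.
    day = [repack(DOWNLOADER, s) for s in range(100)] + [repack(WIPER, 7), BENIGN]
    known = {hashlib.sha256(day[0]).hexdigest()}  # the one build an analyst already hashed
    print(tally(day, known))
    for sample in (day[0], day[1], day[-2], day[-1]):
        print(classify(sample, known))
```

Running it shows the gap the article describes: 102 files yield 102 distinct hashes, the single known hash catches one of them, and one generic rule plus one heuristic cover all 101 malicious files. A tally that counts hashes reports a hundred "unique" programs; a tally that counts drivers reports one.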

While Marcus acknowledges that such issues make it almost irrelevant to keep track of the numbers, he points out that customers want the data. “A customer wants to know how much stuff we are protecting them from today.” He adds, “At the end of the day, [cybercriminals] are writing more malware than they were a year ago.”

Illustration by Rose Wong