AI for cybersecurity is a hot new thing—and a dangerous gamble

Machine learning and artificial intelligence can help guard against cyberattacks, but hackers can foil security algorithms by targeting the data they train on and the warning flags they look for.

Martin Gilesarchive page

August 11, 2018

Image of code overlaid on a photo of hands on a laptop keyboardUnsplash

When I walked around the exhibition floor at this week’s massive Black Hat cybersecurity conference in Las Vegas, I was struck by the number of companies boasting about how they are using machine learning and artificial intelligence to help make the world a safer place.

But some experts worry vendors aren’t paying enough attention to the risks associated with relying heavily on these technologies. “What’s happening is a little concerning, and in some cases even dangerous,” warns Raffael Marty of security firm Forcepoint.

The security industry’s hunger for algorithms is understandable. It’s facing a tsunami of cyberattacks just as the number of devices being hooked up to the internet is exploding. At the same time, there’s a massive shortage of skilled cyber workers (see “Cybersecurity’s insidious new threat: workforce stress”).

Using machine learning and AI to help automate threat detection and response can ease the burden on employees, and potentially help identify threats more efficiently than other software-driven approaches.

Data dangers

But Marty and some others speaking at Black Hat say plenty of firms are now rolling out machine-learning-based products because they feel they have to in order to get an audience with customers who have bought into the AI hype cycle. And there’s a danger that they will overlook ways in which the machine-learning algorithms could create a false sense of security.

Many products being rolled out involve “supervised learning,” which requires firms to choose and label data sets that algorithms are trained on—for instance, by tagging code that’s malware and code that is clean.

Marty says that one risk is that in rushing to get their products to market, companies use training information that hasn’t been thoroughly scrubbed of anomalous data points. That could lead to the algorithm missing some attacks. Another is that hackers who get access to a security firm’s systems could corrupt data by switching labels so that some malware examples are tagged as clean code.

The bad guys don’t even need to tamper with the data; instead, they could work out the features of code that a model is using to flag malware and then remove these from their own malicious code so the algorithm doesn’t catch it.

One versus many

In a session at the conference, Holly Stewart and Jugal Parikh of Microsoft flagged the risk of overreliance on a single, master algorithm to drive a security system. The danger is that if that algorithm is compromised, there’s no other signal that would flag a problem with it.

To help guard against this, Microsoft's Windows Defender threat protection service uses a diverse set of algorithms with different training data sets and features. So if one algorithm is hacked, the results from the others—assuming their integrity hasn’t been compromised too—will highlight the anomaly in the first model.

Beyond these issues. Forcepoint's Marty notes that with some very complex algorithms it can be really difficult to work out why they actually spit out certain answers. This “explainability” issue can make it hard to assess what’s driving any anomalies that crop up (see “The dark secret at the heart of AI”).

None of this means that AI and machine learning shouldn’t have an important role in a defensive arsenal. The message from Marty and others is that it’s really important for security companies—and their customers—to monitor and minimize the risks associated with algorithmic models.

That’s no small challenge given that people with the ideal combination of deep expertise in cybersecurity and in data science are still as rare as a cool day in a Las Vegas summer.

Deep Dive

Computing

Inside the hunt for new physics at the world’s largest particle collider

The Large Hadron Collider hasn’t seen any new particles since the discovery of the Higgs boson in 2012. Here’s what researchers are trying to do about it.

Dan Garistoarchive page

How ASML took over the chipmaking chessboard

MIT Technology Review sat down with outgoing CTO Martin van den Brink to talk about the company’s rise to dominance and the life and death of Moore’s Law.

How Wi-Fi sensing became usable tech

After a decade of obscurity, the technology is being used to track people’s movements.

Meg Duffarchive page

Algorithms are everywhere

Three new books warn against turning into the person the algorithm thinks you are.

Bryan Gardinerarchive page

Stay connected

Illustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

AI for cybersecurity is a hot new thing—and a dangerous gamble

Data dangers

One versus many

Deep Dive

Computing

Inside the hunt for new physics at the world’s largest particle collider

How ASML took over the chipmaking chessboard

How Wi-Fi sensing became usable tech

Algorithms are everywhere

Stay connected

Get the latest updates from
MIT Technology Review

The latest iteration of a legacy

Advertise with MIT Technology Review

About

Help

Data dangers

One versus many

Deep Dive

Computing

Inside the hunt for new physics at the world’s largest particle collider

How ASML took over the chipmaking chessboard

How Wi-Fi sensing became usable tech

Algorithms are everywhere

Stay connected

Get the latest updates fromMIT Technology Review

Get the latest updates from
MIT Technology Review