Skip to Content
Artificial intelligence

How malevolent machine learning could derail AI

AI security expert Dawn Song warns that “adversarial machine learning” could be used to reverse-engineer systems—including those used in defense.
March 25, 2019
Jeremy Portje

Artificial intelligence won’t revolutionize anything if hackers can mess with it.

That’s the warning from Dawn Song, a professor at UC Berkeley who specializes in studying the security risks involved with AI and machine learning.

Speaking at EmTech Digital, an event in San Francisco produced by MIT Technology Review, Song warned that new techniques for probing and manipulating machine-learning systems—known in the field as “adversarial machine learning” methods—could cause big problems for anyone looking to harness the power of AI in business.

Song said adversarial machine learning could be used to attack just about any system built on the technology.

“It’s a big problem,” she told the audience. “We need to come together to fix it.”

Adversarial machine learning involves experimentally feeding input into an algorithm to reveal the information it has been trained on, or distorting input in a way that causes the system to misbehave. By inputting lots of images into a computer vision algorithm, for example, it is possible to reverse-engineer its functioning and ensure certain kinds of outputs, including incorrect ones.

Song presented several examples of adversarial-learning trickery that her research group has explored.

One project, conducted in collaboration with Google, involved probing machine-learning algorithms trained to generate automatic responses from e-mail messages (in this case the Enron e-mail data set). The effort showed that by creating the right messages, it is possible to have the machine model spit out sensitive data such as credit card numbers. The findings were used by Google to prevent Smart Compose, the tool that auto-generates text in Gmail, from being exploited.

Another project involved modifying road signs with a few innocuous-looking stickers to fool the computer vision systems used in many vehicles. In a video demo, Song showed how the car could be tricked into thinking that a stop sign actually says the speed limit is 45 miles per hour. This could be a huge problem for an automated driving system that relies on such information.

Adversarial machine learning is an area of growing interest for machine-learning researchers. Over the past couple of years, other research groups have shown how online machine-learning APIs can be probed and exploited to devise ways to deceive them or to reveal sensitive information.

Unsurprisingly, adversarial machine learning is also of huge interest to the defense community. With a growing number of military systems—including sensing and weapons systems—harnessing machine learning, there is huge potential for these techniques to be used both defensively and offensively.

This year, the Pentagon’s research arm, DARPA, launched a major project called Guaranteeing AI Robustness against Deception (GARD), aimed at studying adversarial machine learning. Hava Siegelmann, director of the GARD program, told MIT Technology Review recently that the goal of this project was to develop AI models that are robust in the face of a wide range of adversarial attacks, rather than simply able to defend against specific ones.

Deep Dive

Artificial intelligence

conceptual illustration showing various women's faces being scanned
conceptual illustration showing various women's faces being scanned

A horrifying new AI app swaps women into porn videos with a click

Deepfake researchers have long feared the day this would arrive.

Conceptual illustration of a therapy session
Conceptual illustration of a therapy session

The therapists using AI to make therapy better

Researchers are learning more about how therapy works by examining the language therapists use with clients. It could lead to more people getting better, and staying better.

a Chichuahua standing on a Great Dane
a Chichuahua standing on a Great Dane

DeepMind says its new language model can beat others 25 times its size

RETRO uses an external memory to look up passages of text on the fly, avoiding some of the costs of training a vast neural network

THE BLOB, 1958, promotional artwork
THE BLOB, 1958, promotional artwork

2021 was the year of monster AI models

GPT-3, OpenAI’s program to mimic human language,  kicked off a new trend in artificial intelligence for bigger and bigger models. How large will they get, and at what cost?

Stay connected

Illustration by Rose WongIllustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

Thank you for submitting your email!

Explore more newsletters

It looks like something went wrong.

We’re having trouble saving your preferences. Try refreshing this page and updating them one more time. If you continue to get this message, reach out to us at customer-service@technologyreview.com with a list of newsletters you’d like to receive.