
How Long Before AI Systems Are Hacked in Creative New Ways?

Research points to ways that machine-learning programs could be tricked into doing unwanted things.
December 15, 2016

The latest artificial-intelligence techniques are being adopted by companies at a blistering pace. Before long, hackers might start taking a closer look, too, and they could cause all sorts of trouble by tricking these systems with illusory data.

Speaking at a recent AI conference in Barcelona, Spain, Ian Goodfellow, a research scientist at OpenAI who has done pioneering work on deceiving machine-learning systems, said attacking the systems is easy. “Almost anything bad you can think of doing to a machine-learning model can be done right now,” he said. “And defending it is really, really hard.”

In the last few years, researchers have demonstrated various ways in which machine-learning programs could be manipulated by exploiting their propensity to spot patterns in data. They are vulnerable, in part, because they recognize statistical patterns rather than understanding what their input actually means. For instance, it is possible to use a billboard to trick the vision systems on self-driving cars into seeing things that aren’t there. Inaudible signals can trick voice-controlled assistants into taking unwanted actions, like visiting a website and downloading a piece of malware.
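
Many of these tricks rest on the idea of an “adversarial example”: an input nudged, often imperceptibly, in exactly the direction that most increases a model’s error. The toy sketch below is purely illustrative, using NumPy and a made-up logistic-regression “classifier” rather than any real system, but it captures the fast-gradient-sign idea that Goodfellow helped popularize.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, pre-"trained" weights for a toy classifier (not a real model).
w = rng.normal(size=100)
b = 0.1

def predict(v):
    # Probability that the input belongs to class "1".
    return 1 / (1 + np.exp(-(w @ v + b)))

# A toy input the classifier labels as class "1" with high confidence.
x = 0.05 * np.sign(w) + 0.01 * rng.normal(size=100)

# Fast-gradient-sign step: the gradient of the loss with respect to the input
# shows which direction to nudge each feature; epsilon controls the nudge size.
epsilon = 0.25
grad_wrt_input = (predict(x) - 1) * w      # gradient of log-loss, assuming true label "1"
x_adv = x + epsilon * np.sign(grad_wrt_input)

print("prediction on the original input:  %.3f" % predict(x))
print("prediction on the perturbed input: %.3f" % predict(x_adv))
```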

Goodfellow and others are developing countermeasures. It is possible to train a machine-learning system to recognize and then ignore misleading examples. But it is tricky to protect against every possible assault. 
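
One such countermeasure, usually called adversarial training, works roughly like this: generate misleading examples on the fly during training and teach the model to classify them correctly anyway. The sketch below is a simplified illustration on made-up data with a tiny logistic-regression model, not anyone’s production defense.

```python
import numpy as np

rng = np.random.default_rng(1)
d, epsilon, lr = 20, 0.1, 0.5

# Made-up training data: class "1" points sit slightly above class "0" points.
X = np.vstack([rng.normal(size=(500, d)) + 0.5,
               rng.normal(size=(500, d)) - 0.5])
y = np.concatenate([np.ones(500), np.zeros(500)])

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w = np.zeros(d)
for step in range(200):
    p = sigmoid(X @ w)
    # Build a worst-case nudged copy of every training point...
    grad_x = (p - y)[:, None] * w
    X_adv = X + epsilon * np.sign(grad_x)
    # ...and update the weights on the nudged copies, so the model learns to
    # give the right answer even when inputs have been perturbed against it.
    p_adv = sigmoid(X_adv @ w)
    w -= lr * X_adv.T @ (p_adv - y) / len(y)
```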

Fooling machine-learning systems may become more than an academic exercise. “This is very real,” says Patrick McDaniel, a professor at Pennsylvania State University who has explored the issue. “Machine-learning systems are driving all kinds of functions that could be monetized by adversaries, and so organized and sophisticated attackers will embrace these attacks.”

McDaniel points out that hackers have been outwitting machine-learning systems for years. Spammers, for instance, have long fed spam-filtering algorithms misleading e-mails so that their spam messages pass through later. He says it may not be long before more sophisticated attacks emerge.

“The first attacks will come very soon against online classification systems,” McDaniel says. This could include modern spam filters, systems designed to detect illicit or copyrighted material, and advanced machine-learning-based computer security systems.

A new paper suggests that the problem could be more widespread than previously known. It shows that certain deceptions can be reused against different machine-learning systems, or even against a large “black box” system about which an attacker does not have prior knowledge.
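
A toy way to see why attacks can transfer (an illustration of the general idea, not the experiment reported in the paper): an attacker who cannot see inside the target model trains a stand-in “substitute” model on similar data, crafts an adversarial input against that substitute, and finds that it often fools the black-box target as well.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 50

def make_data(n):
    # Made-up two-class data; both models below are trained on samples like these.
    X = np.vstack([rng.normal(size=(n, d)) + 0.4,
                   rng.normal(size=(n, d)) - 0.4])
    y = np.concatenate([np.ones(n), np.zeros(n)])
    return X, y

def train(X, y, steps=300, lr=0.5):
    # Plain logistic regression trained by gradient descent.
    w = np.zeros(d)
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def prob(w, v):
    return 1 / (1 + np.exp(-(w @ v)))

w_blackbox   = train(*make_data(1000))   # the victim; the attacker never sees its weights
w_substitute = train(*make_data(1000))   # the attacker's stand-in, trained on similar data

x = rng.normal(size=d) + 0.4                 # an input both models label as class "1"
x_adv = x - 0.6 * np.sign(w_substitute)      # fast-gradient-sign step against the substitute only

print("black-box model, clean input:        %.2f" % prob(w_blackbox, x))
print("black-box model, transferred attack: %.2f" % prob(w_blackbox, x_adv))
```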

Bugs lurking in popular machine-learning tools could provide another way to target them. These tools are being developed at a rapid pace and are often released for free online before being put to work in live services such as image recognition or natural-language analysis.

Speaking at the same conference in Spain, Octavian Suciu, a PhD student at the University of Maryland, highlighted a number of such vulnerabilities in some popular tools. Suciu analyzed the source code of these programs and found ways they could be manipulated. He found problems with the way some tools store information in memory, meaning that feeding in a very large piece of data can overwrite part of the program and change its behavior.

Suciu speculates that the approach could provide a handy way to manipulate, for example, a tool that offers stock predictions, which could then be used to short the market. “If [a model] tells you that the stock will go up, you could change the prediction to say that it would go down,” he says.
