Adversarial Attacks on Deep Neural Networks

Our deep neural networks are powerful machines, but what we don’t understand can hurt us. As sophisticated as they are, they’re highly vulnerable to small attacks that can radically change their outputs. As we go deeper into the capabilities of our networks, we must examine how these networks really work to build more robust security.

At ODSC East 2019, Sihem Romdhani of Veeva Systems outlined how these networks remain highly vulnerable despite their power, and how it's precisely their mysterious inner workings that make it so challenging to build safer networks. We can't continue to rush toward bigger, deeper models without sufficient security, or we will pay the price.

What is an Adversarial Attack?

Humans are great at filtering out noise and perturbations. Deep neural networks, however, are extremely literal, and it takes very little noise to fool a trained network. While we would agree that two nearly identical pictures both show a pig, a small amount of noise imperceptible to our eye can cause the network to decide that one is a pig and the other is an airliner.

The most common type of neural network for image classification is the convolutional neural network, which learns through training to map an input image to a class label. An attacker who knows how the model was trained can manipulate images to suit the purpose of the attack. A targeted attack, for example, perturbs the input image so that the classifier outputs whatever label the attacker wants the machine to see. In some cases, it's possible to accomplish this by changing only one pixel.
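As a rough illustration, here is a minimal sketch of a targeted, gradient-based attack in PyTorch. It assumes white-box access to a pretrained classifier; the model choice, the epsilon value, and the target class index are placeholders, not details from the talk.

```python
# Minimal sketch of a targeted attack, assuming white-box access to a
# pretrained image classifier. Model and class index are placeholders.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(pretrained=True).eval()

def targeted_fgsm(image, target_class, epsilon=0.01):
    """Nudge `image` toward `target_class` with one gradient step (targeted FGSM)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), torch.tensor([target_class]))
    loss.backward()
    # Step against the gradient of the target-class loss to increase
    # the model's confidence in the attacker's chosen label.
    adversarial = image - epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Usage: a 224x224 RGB image pushed toward an arbitrary attacker-chosen label.
clean = torch.rand(1, 3, 224, 224)            # stand-in for a real, normalized image
adv = targeted_fgsm(clean, target_class=404)  # attacker's chosen ImageNet class index
print(model(adv).argmax(dim=1))
```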

These attacks aren’t noise. Noise is random, uncontrolled interference; an attacker controls the perturbation precisely, so it isn’t detectable to standard noise filters. This is what makes these attacks so dangerous.

[Related article: The Importance of Explainable AI]

3D Images Are Also Vulnerable

Some claim that this isn’t a problem for 3D object recognition, but can we really secure the model? No, unfortunately. A recent experiment showed that a 3D-printed turtle with a subtly altered texture could be classified as a rifle from almost any viewing angle.

Most of these attacks had access to the model's parameters, so could hiding those parameters make the models safer? Again, no.

Black-box attacks can fool deep neural networks even when the model's parameters and training data are hidden. By observing only the outputs, an attacker can still fool the network. Even more worrying, adversarial examples often transfer across models, as long as those models are trained for the same task.

The substitute-model technique is one example of this. The attacker generates synthetic inputs, queries the black-box model to label them, and uses those labels to train a substitute model. Because the substitute approximates the decision boundary of the unknown black-box target, adversarial examples crafted against the substitute with small perturbations often transfer to the target itself.
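The sketch below outlines that loop in PyTorch. The substitute architecture, the synthetic inputs, and the `query_black_box` function are all assumptions standing in for whatever the attacker actually faces.

```python
# Sketch of the substitute-model idea, assuming `query_black_box(x)` is the
# attacker's only access to the target (it returns predicted labels). The
# substitute architecture and training loop are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

substitute = nn.Sequential(
    nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10)
)
optimizer = torch.optim.Adam(substitute.parameters(), lr=1e-3)

def train_substitute(query_black_box, synthetic_inputs, epochs=5):
    """Label attacker-generated inputs with the black box, then fit the substitute."""
    labels = query_black_box(synthetic_inputs)   # only the target's outputs are observed
    for _ in range(epochs):
        optimizer.zero_grad()
        F.cross_entropy(substitute(synthetic_inputs), labels).backward()
        optimizer.step()

def transfer_attack(x, true_label, epsilon=0.1):
    """Craft the perturbation on the substitute; it often transfers to the target."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(substitute(x), torch.tensor([true_label])).backward()
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
```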

Other Adversarial Attacks Beyond Image Data

Voice and text data are more difficult to attack, but not impossible. In speech recognition, for example, an attacker can add a small audio perturbation analogous to the pixel perturbations used on image data. It causes a wrong prediction, or a prediction specifically chosen by the attacker. If you have a device like Alexa, an attacker could potentially gain access to your system merely by playing music in your home that carries a hidden command.

Natural language understanding is also at risk. In sentiment analysis, it's possible to change a few characters, sometimes even a single character, and flip the sentiment from positive to negative. An attacker could target a business or organization simply by altering how its posts are scored.
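As an illustration only, here is a brute-force sketch of such a character-level flip. The `predict_sentiment` function is a hypothetical stand-in for the classifier under attack; real attacks use smarter search strategies.

```python
# Illustrative sketch of a character-level attack on a sentiment classifier.
# `predict_sentiment` is a hypothetical stand-in for the model under attack;
# the search simply tries single-character edits until the label flips.
import string

def one_char_attack(text, predict_sentiment):
    original = predict_sentiment(text)                  # e.g. "positive" or "negative"
    for i in range(len(text)):
        for ch in string.ascii_lowercase:
            candidate = text[:i] + ch + text[i + 1:]
            if predict_sentiment(candidate) != original:
                return candidate                        # one edit flipped the sentiment
    return None                                         # no single-character flip found

# Usage (with a hypothetical classifier):
# adversarial = one_char_attack("A truly wonderful film.", predict_sentiment)
```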

[Related article: Innovators and Regulators Collaborate on Book Tackling AI’s Black Box Problem]

Making More Robust Models

So how do we begin to make our models more robust in response to these adversarial attacks? Here are some of the steps Romdhani suggests taking to harness the full power of deep neural networks while making them more secure.

Modifying the training of the model

Adversarial training has shown promising results for making models more robust. At each training iteration, you use the current state of the model to generate adversarial examples from the original inputs and include them alongside those inputs in training. This increases robustness and also reduces overfitting.
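A minimal sketch of that loop in PyTorch might look like the following, assuming `model`, `loader`, and `optimizer` already exist and using a single FGSM step to generate the adversarial examples (the talk does not prescribe a specific attack for this step).

```python
# Minimal adversarial-training sketch, assuming `model`, `loader`, and
# `optimizer` are defined elsewhere. Each batch is augmented with FGSM
# examples generated from the model's current state.
import torch
import torch.nn.functional as F

def adversarial_training_epoch(model, loader, optimizer, epsilon=0.03):
    model.train()
    for x, y in loader:
        # Generate adversarial examples from the current model state.
        x_adv = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

        # Train on the clean and adversarial inputs together.
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
```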

Modifying the network

Network distillation could also help make the network more robust. Distillation was designed to transfer knowledge from large, complicated networks to smaller ones that can be deployed on smartphones, for example, but here you use the knowledge of the model itself to increase security. First, train the model as usual, with each label represented as a one-hot vector. Then train a second model on the same inputs, but with the first model's predicted probabilities as the targets instead of the hard labels. Transferring these softened probabilities increases robustness.
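Here is a minimal sketch of that second, distillation stage in PyTorch. The temperature value, the KL-divergence loss, and the assumption that `teacher`, `student`, `loader`, and `optimizer` already exist are illustrative choices rather than details from the talk.

```python
# Sketch of defensive distillation, assuming `teacher`, `student`, `loader`,
# and `optimizer` are defined elsewhere. The teacher is trained normally on
# one-hot labels; the student is then trained on the teacher's softened
# probabilities at temperature T instead of the hard labels.
import torch
import torch.nn.functional as F

def distill_epoch(teacher, student, loader, optimizer, T=20.0):
    teacher.eval()
    student.train()
    for x, _ in loader:                      # hard labels are ignored here
        with torch.no_grad():
            soft_targets = F.softmax(teacher(x) / T, dim=1)
        optimizer.zero_grad()
        log_probs = F.log_softmax(student(x) / T, dim=1)
        loss = F.kl_div(log_probs, soft_targets, reduction="batchmean")
        loss.backward()
        optimizer.step()
```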

Adding on Networks

Perturbation Rectifying Networks (PRNs) are trained on both real and synthetic image perturbations and are appended as extra layers in front of the target network's input. A query image passes through the system, and a detector checks whether it carries a perturbation. If it does, the output of the PRN, rather than the original image, is used for label prediction. This method is used mainly for detecting and defending against universal perturbations.
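The inference-time flow might look roughly like this sketch, where `prn`, `detector`, and `classifier` are assumed to be already-trained components; their architectures and the detector's exact decision rule are not specified here.

```python
# Sketch of inference with a Perturbation Rectifying Network, assuming
# `prn` (the rectifier), `detector`, and `classifier` are trained models.
import torch

def classify_with_prn(image, prn, detector, classifier):
    rectified = prn(image)
    # The detector decides, from the query image and its rectified version,
    # whether a (universal) perturbation is present.
    if detector(image, rectified):
        return classifier(rectified)   # perturbation found: classify the rectified image
    return classifier(image)           # otherwise classify the original query
```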

Moving Forward With Security

We need more open-source platforms to evaluate networks specifically against adversarial attacks. We also need more data to test networks against different perturbation types. Finally, we need to understand how these networks work. As long as they remain a black box, it will be difficult for us to identify their weaknesses. If we dive more deeply into how they work, we may be able to understand how these attacks exploit the training to alter decisions.

These are powerful networks, but we must focus our efforts on security and reliability, not just on getting answers. It's tempting to keep riding the magic of deep neural networks toward bigger, better AI, but spending time understanding our creations could help us build more secure systems.

Elizabeth Wallace, ODSC

Elizabeth is a Nashville-based freelance writer with a soft spot for startups. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain - clearly - what it is they do. Connect with her on LinkedIn here: https://www.linkedin.com/in/elizabethawallace/
