How To Ensure Your Machine Learning Models Aren’t Fooled
Machine learning models are not infallible. In order to prevent attackers from exploiting a model, researchers have designed various techniques to make machine learning models more robust.
All neural networks are susceptible to “adversarial attacks,” where an attacker supplies a carefully crafted input intended to fool the network. Any system that uses a neural network can be exploited. Fortunately, there are known techniques that can mitigate adversarial attacks and, in some cases, prevent them outright. The field of adversarial machine learning is growing rapidly as companies realize the dangers of these attacks.
We will look at a brief case study of face recognition systems and their potential vulnerabilities. The attacks and counters described here are somewhat general, but face recognition offers easy and understandable examples.
Face Recognition Systems
With the increasing availability of large face datasets, machine learning methods like deep neural networks have become extremely appealing because they are easy to build, train, and deploy. Face recognition systems (FRS) built on these networks inherit the networks’ vulnerabilities. If left unaddressed, these vulnerabilities leave the FRS open to attacks of several forms.
Physical Attacks
The simplest and most obvious attack is a presentation attack, where an attacker simply holds a picture or video of the target person in front of the camera. An attacker could also use a realistic mask to fool an FRS. Though presentation attacks can be effective, they are easily noticed by bystanders and/or human operators.
A more subtle variation on the presentation attack is a physical perturbation attack. This consists of an attacker wearing something specially crafted to fool the FRS, e.g. a specially colored pair of glasses. Though a human would correctly classify the person as a stranger, the FRS neural network may be fooled.
Digital Attacks
Face recognition systems are far more vulnerable to digital attacks. An attacker with knowledge of the FRS’s underlying neural network can craft an input, pixel by pixel, that reliably fools the network and lets the attacker impersonate anyone. This makes digital attacks far more insidious than physical attacks, which are both less effective and more conspicuous.
Digital attacks come in several varieties. Though all are difficult to detect, the most imperceptible is the noise attack. The attacker’s image is modified by a carefully crafted noise pattern in which each pixel value is changed by at most about 1%. To a human, the perturbed image looks identical to the original, but a neural network registers it as a completely different face. This allows the attacker to go unnoticed by both a human operator and the FRS.
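To make this concrete, the sketch below shows how a noise attack could be built with a fast gradient sign method (FGSM) style step. The `model`, `attacker_img`, and `target_img` names are hypothetical placeholders for a PyTorch face-embedding network and preprocessed image tensors; this illustrates the general technique, not the internals of any particular FRS.

```python
# Minimal sketch of an FGSM-style noise attack against a face-embedding
# network. `model`, `attacker_img`, and `target_img` are hypothetical
# placeholders (a PyTorch embedding model and [0, 1] image tensors).
import torch
import torch.nn.functional as F

def fgsm_impersonation(model, attacker_img, target_img, epsilon=0.01):
    """Perturb each pixel by at most `epsilon` (~1% of the pixel range)
    so the attacker's embedding moves toward the target's embedding."""
    attacker_img = attacker_img.clone().detach().requires_grad_(True)
    with torch.no_grad():
        target_emb = model(target_img)      # embedding to impersonate

    attacker_emb = model(attacker_img)
    # Loss: distance between the attacker's and the target's embeddings.
    loss = F.mse_loss(attacker_emb, target_emb)
    loss.backward()

    # Step each pixel against the gradient, bounded by epsilon, and keep
    # pixel values in the valid [0, 1] range.
    perturbed = attacker_img - epsilon * attacker_img.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```

Because the perturbation is bounded by `epsilon`, no single pixel changes by more than about 1% of its range, which is why the result is indistinguishable to the human eye.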
Other digital attacks include transformation and generative attacks. Transformation attacks rotate the face or shift facial features such as the eyes in ways intended to fool the FRS. Generative attacks use sophisticated generative models to synthesize images of the attacker whose facial structure resembles the target’s.
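A transformation attack can be sketched in the same spirit: instead of adding noise, the attacker searches over small geometric changes, here rotations, for the one that pushes their face embedding closest to the target’s. As before, `model` and the image tensors are hypothetical placeholders.

```python
# Minimal sketch of a transformation attack: try small rotations and keep
# the one whose embedding is closest to the target's.
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

def best_rotation_attack(model, attacker_img, target_img, max_deg=10):
    with torch.no_grad():
        target_emb = model(target_img)
        best_img, best_dist = attacker_img, float("inf")
        for deg in range(-max_deg, max_deg + 1):
            candidate = TF.rotate(attacker_img, angle=float(deg))
            dist = F.mse_loss(model(candidate), target_emb).item()
            if dist < best_dist:
                best_img, best_dist = candidate, dist
    return best_img
```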
Possible Solutions
Addressing the vulnerabilities of face recognition systems, and of neural networks in general, is the focus of the field of machine learning robustness. This field studies why models behave inconsistently once deployed and provides concrete techniques for mitigating adversarial attacks.
One possible way to improve neural network robustness is to incorporate adversarial examples into training. This usually results in a model that is slightly less accurate on the training data, but the model will be better able to detect and reject adversarial attacks when deployed. An added benefit is that the model will perform more consistently on real-world data, which is often noisy and inconsistent.
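A minimal sketch of this idea is shown below, assuming a PyTorch classifier, a standard `(image, label)` data loader, and a simple FGSM-style perturbation. All names are illustrative, and real adversarial training pipelines are typically more elaborate (e.g., using multi-step attacks).

```python
# Minimal sketch of adversarial training: each batch is perturbed with a
# simple FGSM step, and the model is trained on clean + adversarial data.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, epsilon=0.01):
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Move each pixel slightly in the direction that increases the loss.
    return (images + epsilon * images.grad.sign()).clamp(0, 1).detach()

def train_with_adversarial_examples(model, train_loader, epochs=10, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, labels in train_loader:
            adv_images = fgsm_perturb(model, images, labels)
            optimizer.zero_grad()
            # Average the loss over clean and adversarial examples so the
            # model learns to classify both correctly.
            loss = (F.cross_entropy(model(images), labels)
                    + F.cross_entropy(model(adv_images), labels)) / 2
            loss.backward()
            optimizer.step()
```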
Another common way to improve model robustness is to use more than one machine learning model with ensemble learning. In the case of face recognition systems, multiple neural networks with different structures could be used in tandem. Different neural networks have different vulnerabilities, so an adversarial attack can only exploit the vulnerabilities of one or two networks at a time. Since the final decision is a “majority vote,” adversarial attacks cannot fool the FRS without fooling a majority of the neural networks. This would require significant changes to the image that would be easily noticeable by the FRS or an operator.
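As a rough sketch of the majority-vote idea, assuming `models` is a list of independently trained PyTorch face classifiers that each return class logits (all names hypothetical):

```python
# Minimal sketch of an ensemble FRS decision by majority vote.
import torch

def ensemble_identify(models, image, threshold=0.5):
    """Return the identity predicted by a strict majority of models,
    or None if no identity wins a majority."""
    votes = []
    with torch.no_grad():
        for model in models:
            logits = model(image)
            votes.append(int(logits.argmax(dim=-1)))
    winner = max(set(votes), key=votes.count)
    if votes.count(winner) / len(models) > threshold:
        return winner
    return None  # no majority: reject or escalate to a human operator
```

If no identity wins a strict majority, the input can be rejected or escalated to a human operator, which is exactly the outcome an attacker needs to avoid.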
Conclusion
The exponential growth of data in various fields has made neural networks and other machine learning models great candidates for a plethora of tasks. Problems that once took thousands of engineering hours to solve now have simple, elegant solutions. For instance, the code behind Google Translate was reduced from 500,000 lines to just 500.
These advancements, however, bring the dangers of adversarial attacks that exploit neural network structure for malicious purposes. To combat these vulnerabilities, machine learning robustness techniques must be applied so that adversarial attacks can be detected and prevented.
About the Author
Alex Saad-Falcon is a content writer for PDF Electric & Supply. He is a published research engineer at an internationally acclaimed research institute, where he leads internal and sponsored projects. Alex has his MS in Electrical Engineering from Georgia Tech and is pursuing a PhD in machine learning.