Practical Adversarial Attacks against Deep Learning Models

Wu, H

Abstract

Advances in deep neural networks have opened a new era of robotics, intelligent robots. Compared with traditional robots that perform repetitive tasks based on manual control or predefined rules, intelligent robots possess a more comprehensive perception of environments and make more sophisticated decisions for various tasks. For ...

Advances in deep neural networks have opened a new era of robotics, intelligent robots. Compared with traditional robots that perform repetitive tasks based on manual control or predefined rules, intelligent robots possess a more comprehensive perception of environments and make more sophisticated decisions for various tasks. For example, autonomous vehicles, one type of mobile robot, rely on deep learning models to perceive their surroundings and navigate complex environments. However, it is no more a secret that deep learning models are vulnerable to adversarial attacks. Recent research unveils that deep neural networks can be fooled by adding human unperceivable perturbations to the input data, posing threats against autonomous vehicles that rely on deep neural networks to achieve image classification, object detection and tracking, etc. This thesis addresses the question: Can we develop practical adversarial attacks against deep learning applications? To answer the question, two real-time white-box attacks against the NVIDIA end-to-end driving model are presented. The end-to-end driving model takes images captured by the front camera as input and outputs the steering angle. We design both image-specific and image-agnostic attacks to alter the steering angle, intentionally deviating it from the model’s original output. For modular autonomous driving systems, we devise real-time white-box attacks against object detection models. These attacks generate human unperceivable perturbations of arbitrary shapes to fabricate objects at desired locations. We further introduce a Human-in-the-Middle hardware attack to inject Universal Adversarial Perturbation (UAP) into a USB camera. The evaluation results on the VOC2012 and CARLA autonomous driving datasets show that our attacks produce more stable false bounding boxes than in previous work. Additionally, the attack significantly reduces the tracking accuracy of the Tracking-By-Detection (TBD) framework. Lastly, we propose a distributed black-box attack to accelerate attacks on machine-learning cloud services. By targeting cloud APIs directly, rather than local models, we avoid mistakes made in prior research that obtained unfair advantages by applying the perturbation before image encoding and preprocessing.

Practical Adversarial Attacks against Deep Learning Models

Doctoral Theses

Doctoral College