Verifying Feedforward Neural Networks for Classification in Isabelle/HOL

Brucker, AD; Stell, A

Abstract

Neural networks are being used successfully to solve classification problems, e.g., for detecting objects in images. It is well known that neural networks are susceptible if small changes applied to their input result in misclassification. Situations in which such a slight input change, often hardly noticeable by a human expert, results ...

Neural networks are being used successfully to solve classification problems, e.g., for detecting objects in images. It is well known that neural networks are susceptible if small changes applied to their input result in misclassification. Situations in which such a slight input change, often hardly noticeable by a human expert, results in a misclassification are called adversarial examples. If such inputs are used for adversarial attacks, they can be life-threatening if, for example, they occur in image classification systems used in autonomous cars or medical diagnosis. Systems employing neural networks, e.g., for safety or security-critical functionality, are a particular challenge for formal verification, which usually expects a formal specification (e.g., given as source code in a programming language for which a formal semantics exists). Such a formal specification does, per se, not exist for neural networks. In this paper, we address this challenge by presenting a formal embedding of feedforward neural networks into Isabelle/HOL and discussing desirable properties for neural networks in critical applications. Our Isabelle-based prototype can import neural networks trained in TensorFlow, and we demonstrate our approach using a neural network trained for the classification of digits on a dot-matrix display.

Verifying Feedforward Neural Networks for Classification in Isabelle/HOL

Computer Science

Faculty of Environment, Science and Economy