Input Validation for Neural Networks via Runtime Local Robustness Verification

Local robustness verification can prove that a neural network is robust with respect to any perturbation of a specific input within a certain distance; we call this distance the robustness radius. We observe that the robustness radii of correctly classified inputs are much larger than those of misclassified inputs, which include adversarial examples, especially those produced by strong adversarial attacks. We also observe that the robustness radii of correctly classified inputs often follow a normal distribution. Based on these two observations, we propose validating inputs to neural networks via runtime local robustness verification. Experiments show that our approach can protect neural networks from adversarial examples and improve their accuracy.
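
The abstract does not spell out the decision rule, so the following is a minimal sketch of the idea under stated assumptions: the robustness radii of correctly classified validation inputs are fitted with a normal distribution, and a runtime input is flagged when its certified radius falls below a low quantile of that fit. The function names `calibrate_threshold` and `validate_input` are illustrative, and `radius` is assumed to come from an external local robustness verifier, which is left abstract here.

```python
# Minimal sketch of runtime input validation via robustness radii.
# Assumption: a separate verifier supplies the certified radius for each input;
# this code only shows the calibration and acceptance check.
import numpy as np
from scipy import stats


def calibrate_threshold(radii, alpha=0.01):
    """Fit a normal distribution to the radii of correctly classified
    validation inputs and return the alpha-quantile as a rejection
    threshold (inputs with smaller radii are treated as suspicious)."""
    mu = np.mean(radii)
    sigma = np.std(radii, ddof=1)
    threshold = stats.norm.ppf(alpha, loc=mu, scale=sigma)
    return threshold, (mu, sigma)


def validate_input(radius, threshold):
    """Accept the prediction only if the verified radius is at least the
    calibrated threshold; otherwise flag the input as likely misclassified
    or adversarial."""
    return radius >= threshold


if __name__ == "__main__":
    # Synthetic radii standing in for verifier output on clean inputs.
    rng = np.random.default_rng(0)
    clean_radii = rng.normal(loc=0.08, scale=0.02, size=500)

    threshold, _ = calibrate_threshold(clean_radii, alpha=0.01)
    print(f"rejection threshold: {threshold:.4f}")
    print(validate_input(0.09, threshold))   # typical clean radius -> accepted
    print(validate_input(0.005, threshold))  # tiny radius -> rejected
```

The quantile `alpha` trades off false rejections of clean inputs against missed adversarial examples; it is a hypothetical knob here, not a value taken from the paper.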
