A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability

In the past few years, significant progress has been made on deep neural networks (DNNs), which have achieved human-level performance on several long-standing tasks. As DNNs are deployed in an increasingly broad range of applications, public concern over their safety and trustworthiness has grown, especially after widely reported fatal incidents involving self-driving cars. Research addressing these concerns is particularly active, with a significant number of papers released in the past few years. This survey reviews the current research effort into making DNNs safe and trustworthy, focusing on four aspects: verification, testing, adversarial attack and defence, and interpretability. In total, we survey 202 papers, most of which were published after 2017.
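To give a concrete flavour of the adversarial-attack aspect surveyed below, the following is a minimal sketch of the Fast Gradient Sign Method (FGSM) of Goodfellow et al. [79], written in PyTorch. The classifier `model`, the inputs `x`, the labels `y`, and the budget `epsilon` are illustrative placeholders introduced here for exposition, not artefacts of any particular surveyed paper.

```python
# Minimal FGSM sketch (after Goodfellow et al. [79]); `model`, `x`, `y`,
# and `epsilon` are illustrative placeholders, not taken from the survey.
import torch
import torch.nn.functional as F

def fgsm_attack(model: torch.nn.Module,
                x: torch.Tensor,
                y: torch.Tensor,
                epsilon: float = 0.03) -> torch.Tensor:
    """Return an adversarial copy of x within an L-infinity ball of radius epsilon."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)     # loss the attacker wants to increase
    loss.backward()                             # gradient of the loss w.r.t. the input
    perturbation = epsilon * x_adv.grad.sign()  # one signed gradient step
    return (x_adv + perturbation).clamp(0.0, 1.0).detach()
```

A single signed gradient step of this kind already suffices to flip the predictions of many undefended image classifiers, which is why FGSM is the baseline attack against which the defences surveyed here are commonly evaluated.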

[1]  John Schulman,et al.  Concrete Problems in AI Safety , 2016, ArXiv.

[2]  David D. Cox,et al.  On the information bottleneck theory of deep learning , 2018, ICLR.

[3]  Edmund M. Clarke,et al.  Counterexample-Guided Abstraction Refinement , 2000, CAV.

[4]  Wen-Chuan Lee,et al.  MODE: automated neural network model debugging via state differential analysis and input selection , 2018, ESEC/SIGSOFT FSE.

[5]  Martin Wattenberg,et al.  SmoothGrad: removing noise by adding noise , 2017, ArXiv.

[6]  Chih-Hong Cheng,et al.  Runtime Monitoring Neuron Activation Patterns , 2018, 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[7]  Mingyan Liu,et al.  Spatially Transformed Adversarial Examples , 2018, ICLR.

[8]  Naftali Tishby,et al.  The information bottleneck method , 2000, ArXiv.

[9]  Le Song,et al.  Learning to Explain: An Information-Theoretic Perspective on Model Interpretation , 2018, ICML.

[10]  Carlos Guestrin,et al.  "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.

[11]  Alan L. Yuille,et al.  Mitigating adversarial effects through randomization , 2017, ICLR.

[12]  Xiaowei Huang,et al.  Test Metrics for Recurrent Neural Networks , 2019, ArXiv.

[13]  Abhishek Das,et al.  Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[14]  Kelly J. Hayhurst,et al.  A Practical Tutorial on Modified Condition/Decision Coverage , 2001 .

[15]  Xiaowei Huang,et al.  Reachability Analysis of Deep Neural Networks with Provable Guarantees , 2018, IJCAI.

[16]  H. Tsukimoto,et al.  Rule extraction from neural networks via decision tree induction , 2001, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222).

[17]  Saibal Mukhopadhyay,et al.  Cascade Adversarial Machine Learning Regularized with a Unified Embedding , 2017, ICLR.

[18]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[19]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Isay Katsman,et al.  Generative Adversarial Perturbations , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[21]  Hong Liu,et al.  Universal Perturbation Attack Against Image Retrieval , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[22]  Thomas Brox,et al.  Inverting Convolutional Networks with Convolutional Networks , 2015, ArXiv.

[23]  Eneldo Loza Mencía,et al.  DeepRED - Rule Extraction from Deep Neural Networks , 2016, DS.

[24]  Sarfraz Khurshid,et al.  Symbolic Execution for Deep Neural Networks , 2018, ArXiv.

[25]  Yifan Zhou,et al.  Reliability Validation of Learning Enabled Vehicle Tracking , 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[26]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[27]  Graham W. Taylor,et al.  Learning Confidence for Out-of-Distribution Detection in Neural Networks , 2018, ArXiv.

[28]  Chung-Hao Huang,et al.  Quantitative Projection Coverage for Testing ML-enabled Autonomous Systems , 2018, ATVA.

[29]  Eran Yahav,et al.  Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples , 2017, ICML.

[30]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[31]  Ming-Yu Liu,et al.  Tactics of Adversarial Attack on Deep Reinforcement Learning Agents , 2017, IJCAI.

[32]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[33]  Mark Harman,et al.  An Analysis and Survey of the Development of Mutation Testing , 2011, IEEE Transactions on Software Engineering.

[34]  Scott Lundberg,et al.  A Unified Approach to Interpreting Model Predictions , 2017, NIPS.

[35]  Tom Schaul,et al.  Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.

[36]  Xiaowei Huang,et al.  Reasoning about Cognitive Trust in Stochastic Multiagent Systems , 2017, AAAI.

[37]  Mykel J. Kochenderfer,et al.  Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks , 2017, CAV.

[38]  Thomas Brox,et al.  Universal Adversarial Perturbations Against Semantic Image Segmentation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[39]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput.

[40]  Suman Jana,et al.  DeepTest: Automated Testing of Deep-Neural-Network-Driven Autonomous Cars , 2017, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).

[41]  Avanti Shrikumar,et al.  Learning Important Features Through Propagating Activation Differences , 2017, ICML.

[42]  Deborah Silver,et al.  Feature Visualization , 1994, Scientific Visualization.

[43]  Moustapha Cissé,et al.  Countering Adversarial Images using Input Transformations , 2018, ICLR.

[44]  Matthias Hein,et al.  Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation , 2017, NIPS.

[45]  J. H. Davis,et al.  An Integrative Model of Organizational Trust , 1995 .

[46]  Seyed-Mohsen Moosavi-Dezfooli,et al.  DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Kamyar Azizzadenesheli,et al.  Stochastic Activation Pruning for Robust Adversarial Defense , 2018, ICLR.

[48]  Yue Zhao,et al.  DLFuzz: differential fuzzing testing of deep learning systems , 2018, ESEC/SIGSOFT FSE.

[49]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[50]  Pushmeet Kohli,et al.  A Dual Approach to Scalable Verification of Deep Networks , 2018, UAI.

[51]  Michael E. Houle,et al.  Local Intrinsic Dimensionality I: An Extreme-Value-Theoretic Foundation for Similarity Applications , 2017, SISAP.

[52]  Dana Angluin,et al.  Learning Regular Sets from Queries and Counterexamples , 1987, Inf. Comput.

[53]  Sudipta Chattopadhyay,et al.  Automated Directed Fairness Testing , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[54]  Helmut Veith,et al.  Counterexample-guided abstraction refinement for symbolic model checking , 2003, JACM.

[55]  Chung-Hao Huang,et al.  nn-dependability-kit: Engineering Neural Networks for Safety-Critical Systems , 2018, ArXiv.

[56]  David A. Forsyth,et al.  NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles , 2017, ArXiv.

[57]  Farinaz Koushanfar,et al.  Universal Adversarial Perturbations for Speech Recognition Systems , 2019, INTERSPEECH.

[58]  J. Zico Kolter,et al.  Provable defenses against adversarial examples via the convex outer adversarial polytope , 2017, ICML.

[59]  Alessio Lomuscio,et al.  An approach to reachability analysis for feed-forward ReLU neural networks , 2017, ArXiv.

[60]  Patrick Cousot,et al.  Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints , 1977, POPL.

[61]  David A. Wagner,et al.  Towards Evaluating the Robustness of Neural Networks , 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[62]  Nina Narodytska,et al.  Formal Analysis of Deep Binarized Neural Networks , 2018, IJCAI.

[63]  Francisco Herrera,et al.  A unifying view on dataset shift in classification , 2012, Pattern Recognit.

[64]  Aleksander Madry,et al.  Towards Deep Learning Models Resistant to Adversarial Attacks , 2017, ICLR.

[65]  Xiaowei Huang,et al.  testRNN: Coverage-guided Testing on Recurrent Neural Networks , 2019, ArXiv.

[66]  John C. Duchi,et al.  Certifying Some Distributional Robustness with Principled Adversarial Training , 2017, ICLR.

[67]  Zhou Wang,et al.  Multiscale structural similarity for image quality assessment , 2003, The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003.

[68]  Hang Su,et al.  Sparse Adversarial Perturbations for Videos , 2018, AAAI.

[69]  Lei Ma,et al.  DeepGauge: Comprehensive and Multi-Granularity Testing Criteria for Gauging the Robustness of Deep Learning Systems , 2018, ArXiv.

[70]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[71]  Percy Liang,et al.  Understanding Black-box Predictions via Influence Functions , 2017, ICML.

[72]  Kevin Gimpel,et al.  Early Methods for Detecting Adversarial Images , 2016, ICLR.

[73]  Matthew Wicker,et al.  Feature-Guided Black-Box Safety Testing of Deep Neural Networks , 2017, TACAS.

[74]  Ramprasaath R. Selvaraju,et al.  Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization , 2016 .

[75]  Ryan R. Curtin,et al.  Detecting Adversarial Samples from Artifacts , 2017, ArXiv.

[76]  Yannic Noller,et al.  HyDiff: Hybrid Differential Software Analysis , 2020, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).

[77]  Klaus-Robert Müller,et al.  Explainable artificial intelligence , 2017 .

[78]  Ryen W. White Opportunities and challenges in search interaction , 2018, Commun. ACM.

[79]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[80]  Alberto L. Sangiovanni-Vincentelli,et al.  Systematic Testing of Convolutional Neural Networks for Autonomous Driving , 2017, ArXiv.

[81]  Sven Gowal,et al.  Scalable Verified Training for Provably Robust Image Classification , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[82]  Percy Liang,et al.  Certified Defenses for Data Poisoning Attacks , 2017, NIPS.

[83]  Rama Chellappa,et al.  Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models , 2018, ICLR.

[84]  Yifan Zhou,et al.  Formal Verification of Robustness and Resilience of Learning-Enabled State Estimation Systems for Robotics , 2020, ArXiv.

[85]  Ananthram Swami,et al.  Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks , 2015, 2016 IEEE Symposium on Security and Privacy (SP).

[86]  Zhendong Su,et al.  A Survey on Data-Flow Testing , 2017, ACM Comput. Surv.

[87]  Max Welling,et al.  Visualizing Deep Neural Network Decisions: Prediction Difference Analysis , 2017, ICLR.

[88]  R. Venkatesh Babu,et al.  Generalizable Data-Free Objective for Crafting Universal Adversarial Perturbations , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[89]  Pushmeet Kohli,et al.  Piecewise Linear Neural Network verification: A comparative study , 2017, ArXiv.

[90]  Cengiz Öztireli,et al.  Towards better understanding of gradient-based attribution methods for Deep Neural Networks , 2017, ICLR.

[91]  Yvan Saeys,et al.  Lower bounds on the robustness to adversarial perturbations , 2017, NIPS.

[92]  Chih-Hong Cheng,et al.  Maximum Resilience of Artificial Neural Networks , 2017, ATVA.

[93]  Daniel Kroening,et al.  Global Robustness Evaluation of Deep Neural Networks with Provable Guarantees for the Hamming Distance , 2019, IJCAI.

[94]  Junfeng Yang,et al.  Towards Practical Verification of Machine Learning: The Case of Computer Vision Systems , 2017, ArXiv.

[95]  Sandy H. Huang,et al.  Adversarial Attacks on Neural Network Policies , 2017, ICLR.

[96]  Seong Joon Oh,et al.  Sequential Attacks on Agents for Long-Term Adversarial Goals , 2018, ArXiv.

[97]  Luca Pulina,et al.  An Abstraction-Refinement Approach to Verification of Artificial Neural Networks , 2010, CAV.

[98]  Bernhard C. Geiger,et al.  How (Not) To Train Your Neural Network Using the Information Bottleneck Principle , 2018, ArXiv.

[99]  Seyed-Mohsen Moosavi-Dezfooli,et al.  Universal Adversarial Perturbations , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[100]  Ananthram Swami,et al.  The Limitations of Deep Learning in Adversarial Settings , 2015, 2016 IEEE European Symposium on Security and Privacy (EuroS&P).

[101]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[102]  George Danezis,et al.  Learning Universal Adversarial Perturbations with Generative Models , 2017, 2018 IEEE Security and Privacy Workshops (SPW).

[103]  Jian Pei,et al.  Exact and Consistent Interpretation for Piecewise Linear Neural Networks: A Closed Form Solution , 2018, KDD.

[104]  Yanjun Qi,et al.  Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks , 2017, NDSS.

[105]  C. Lee Giles,et al.  Extraction of rules from discrete-time recurrent neural networks , 1996, Neural Networks.

[106]  Junfeng Yang,et al.  DeepXplore: Automated Whitebox Testing of Deep Learning Systems , 2017, SOSP.

[107]  Jianjun Zhao,et al.  DeepStellar: model-based quantitative analysis of stateful deep learning systems , 2019, ESEC/SIGSOFT FSE.

[108]  Geoffrey E. Hinton,et al.  Deep Learning , 2015, Nature.

[109]  Tom Schaul,et al.  Prioritized Experience Replay , 2015, ICLR.

[110]  Naftali Tishby,et al.  Opening the Black Box of Deep Neural Networks via Information , 2017, ArXiv.

[111]  Joseph Sifakis,et al.  Autonomous Systems - An Architectural Characterization , 2018, Models, Languages, and Tools for Concurrent and Distributed Programming.

[112]  Agustí Verde Parera,et al.  General data protection regulation , 2018 .

[113]  Samy Bengio,et al.  Adversarial examples in the physical world , 2016, ICLR.

[114]  Christian Gagné,et al.  Controlling Over-generalization and its Effect on Adversarial Examples Generation and Detection , 2018, ArXiv.

[115]  Andrew Zisserman,et al.  Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.

[116]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[117]  Andrew L. Beam,et al.  Adversarial Attacks Against Medical Deep Learning Systems , 2018, ArXiv.

[118]  Paul Voosen,et al.  How AI detectives are cracking open the black box of deep learning , 2017 .

[119]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res.

[120]  David Silver,et al.  Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.

[121]  John C. Duchi,et al.  Certifiable Distributional Robustness with Principled Adversarial Training , 2017, ArXiv.

[122]  Swarat Chaudhuri,et al.  AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation , 2018, 2018 IEEE Symposium on Security and Privacy (SP).

[123]  Carlos Guestrin,et al.  Anchors: High-Precision Model-Agnostic Explanations , 2018, AAAI.

[124]  Yang Song,et al.  PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples , 2017, ICLR.

[125]  Fabio Roli,et al.  Evasion Attacks against Machine Learning at Test Time , 2013, ECML/PKDD.

[126]  Diptikalyan Saha,et al.  Automated Test Generation to Detect Individual Discrimination in AI Models , 2018, ArXiv.

[127]  Min Wu,et al.  Safety Verification of Deep Neural Networks , 2016, CAV.

[128]  Chung-Hao Huang,et al.  Towards Dependability Metrics for Neural Networks , 2018, 2018 16th ACM/IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE).

[129]  lawa Kanas,et al.  Metric Spaces , 2020, An Introduction to Functional Analysis.

[130]  Alex Graves,et al.  Conditional Image Generation with PixelCNN Decoders , 2016, NIPS.

[131]  Jun Wan,et al.  MuNN: Mutation Analysis of Neural Networks , 2018, 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C).

[132]  Rick Salay,et al.  Using Machine Learning Safely in Automotive Software: An Assessment and Adaption of Software Process Requirements in ISO 26262 , 2018, ArXiv.

[133]  Demis Hassabis,et al.  Mastering the game of Go without human knowledge , 2017, Nature.

[134]  Andrea Vedaldi,et al.  Interpretable Explanations of Black Boxes by Meaningful Perturbation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[135]  Ian Goodfellow,et al.  TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing , 2018, ICML.

[136]  Colin Raffel,et al.  Thermometer Encoding: One Hot Way To Resist Adversarial Examples , 2018, ICLR.

[137]  Dan Boneh,et al.  Ensemble Adversarial Training: Attacks and Defenses , 2017, ICLR.

[138]  David A. Wagner,et al.  MagNet and "Efficient Defenses Against Adversarial Attacks" are Not Robust to Adversarial Examples , 2017, ArXiv.

[139]  Hong Zhu,et al.  Software unit test coverage and adequacy , 1997, ACM Comput. Surv.

[140]  Yarin Gal,et al.  Real Time Image Saliency for Black Box Classifiers , 2017, NIPS.

[141]  Liqian Chen,et al.  Analyzing Deep Neural Networks with Symbolic Propagation: Towards Higher Precision and Faster Verification , 2019, SAS.

[142]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[143]  Hod Lipson,et al.  Understanding Neural Networks Through Deep Visualization , 2015, ArXiv.

[144]  R. Srikant,et al.  Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks , 2017, ICLR.

[145]  Matthew Hill,et al.  "Boxing Clever": Practical Techniques for Gaining Insights into Training Data and Monitoring Distribution Shift , 2018, SAFECOMP Workshops.

[146]  Aleksander Madry,et al.  A Rotation and a Translation Suffice: Fooling CNNs with Simple Transformations , 2017, ArXiv.

[147]  Hoyt Lougee,et al.  Software Considerations in Airborne Systems and Equipment Certification , 2001 .

[148]  Vittoria Bruni,et al.  An entropy based approach for SSIM speed up , 2017, Signal Process.

[149]  Zachary Chase Lipton The mythos of model interpretability , 2016, ACM Queue.

[150]  Min Wu,et al.  A Game-Based Approximate Verification of Deep Neural Networks with Provable Guarantees , 2018, Theor. Comput. Sci.

[151]  A. Jefferson Offutt,et al.  Introduction to Software Testing , 2008 .

[152]  Aditi Raghunathan,et al.  Certified Defenses against Adversarial Examples , 2018, ICLR.

[153]  Ian J. Goodfellow,et al.  Technical Report on the CleverHans v2.1.0 Adversarial Examples Library , 2016 .

[154]  Prateek Mittal,et al.  Dimensionality Reduction as a Defense against Evasion Attacks on Machine Learning Classifiers , 2017, ArXiv.

[155]  Cho-Jui Hsieh,et al.  Efficient Neural Network Robustness Certification with General Activation Functions , 2018, NeurIPS.

[156]  David Flynn,et al.  A Safety Framework for Critical Systems Utilising Deep Neural Networks , 2020, SAFECOMP.

[157]  Daniel Kroening,et al.  Concolic Testing for Deep Neural Networks , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[158]  Sarfraz Khurshid,et al.  DeepRoad: GAN-based Metamorphic Autonomous Driving System Testing , 2018, ArXiv.

[159]  Thomas Brox,et al.  Striving for Simplicity: The All Convolutional Net , 2014, ICLR.

[160]  Leonid Ryzhyk,et al.  Verifying Properties of Binarized Deep Neural Networks , 2017, AAAI.

[161]  Matthew Mirman,et al.  Differentiable Abstract Interpretation for Provably Robust Neural Networks , 2018, ICML.

[162]  Weiming Xiang,et al.  Output Reachable Set Estimation and Verification for Multilayer Neural Networks , 2017, IEEE Transactions on Neural Networks and Learning Systems.

[163]  Inderjit S. Dhillon,et al.  Towards Fast Computation of Certified Robustness for ReLU Networks , 2018, ICML.

[164]  Jan Hendrik Metzen,et al.  On Detecting Adversarial Perturbations , 2017, ICLR.

[165]  Cho-Jui Hsieh,et al.  RecurJac: An Efficient Recursive Algorithm for Bounding Jacobian Matrix of Neural Networks and Its Applications , 2018, AAAI.

[166]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[167]  Shin Yoo,et al.  Guiding Deep Learning System Testing Using Surprise Adequacy , 2018, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).

[168]  Lei Ma,et al.  DeepMutation: Mutation Testing of Deep Learning Systems , 2018, 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE).

[169]  Matthew P. Wand,et al.  Kernel Smoothing , 1995 .

[170]  Alexander Binder,et al.  On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation , 2015, PLoS ONE.

[171]  Martin Wattenberg,et al.  Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) , 2017, ICML.

[172]  Xiaoxing Ma,et al.  Manifesting Bugs in Machine Learning Code: An Explorative Study with Mutation Testing , 2018, 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS).

[173]  Lei Ma,et al.  DeepHunter: Hunting Deep Neural Network Defects via Coverage-Guided Fuzzing , 2018, ArXiv.

[174]  Edmund M. Clarke,et al.  Counterexample-guided abstraction refinement , 2003, 10th International Symposium on Temporal Representation and Reasoning, 2003 and Fourth International Conference on Temporal Logic. Proceedings.

[175]  Ö. Özer,et al.  Trust and Trustworthiness , 2017, The Handbook of Behavioral Operations.

[176]  Ian J. Goodfellow Gradient Masking Causes CLEVER to Overestimate Adversarial Perturbation Size , 2018, ArXiv.

[177]  Daniel Kroening,et al.  Testing Deep Neural Networks , 2018, ArXiv.

[178]  Hao Chen,et al.  MagNet: A Two-Pronged Defense against Adversarial Examples , 2017, CCS.

[179]  Daniel Kroening,et al.  Structural Test Coverage Criteria for Deep Neural Networks , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion).

[180]  Dejing Dou,et al.  HotFlip: White-Box Adversarial Examples for Text Classification , 2017, ACL.

[181]  Yadong Wang,et al.  Combinatorial Testing for Deep Learning Systems , 2018, ArXiv.

[182]  Arnold Neumaier,et al.  Safe bounds in linear and mixed-integer linear programming , 2004, Math. Program.

[183]  Girish Chowdhary,et al.  Robust Deep Reinforcement Learning with Adversarial Attacks , 2017, AAMAS.

[184]  Natalie Wolchover,et al.  New Theory Cracks Open the Black Box of Deep Learning , 2017 .

[185]  Enrico Bertini,et al.  Interpreting Black-Box Classifiers Using Instance-Level Visual Explanations , 2017, HILDA@SIGMOD.

[186]  Stephan Merz,et al.  Model Checking , 2000 .

[187]  Jinfeng Yi,et al.  Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approach , 2018, ICLR.

[188]  Andrea Vedaldi,et al.  Understanding deep image representations by inverting them , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[189]  Ashish Tiwari,et al.  Output Range Analysis for Deep Neural Networks , 2017, ArXiv.

[190]  Pascal Vincent,et al.  Visualizing Higher-Layer Features of a Deep Network , 2009 .

[191]  Roderick Bloem,et al.  Repair with On-The-Fly Program Analysis , 2012, Haifa Verification Conference.

[192]  Daniel Kroening,et al.  Structural Test Coverage Criteria for Deep Neural Networks , 2019, ACM Trans. Embed. Comput. Syst.

[193]  Rüdiger Ehlers,et al.  Formal Verification of Piece-Wise Linear Feed-Forward Neural Networks , 2017, ATVA.

[194]  Daniel Kroening,et al.  DeepConcolic: Testing and Debugging Deep Neural Networks , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion).

[195]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[196]  Brian Kingsbury,et al.  Estimating Information Flow in Neural Networks , 2018, ArXiv.

[197]  James Bailey,et al.  Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality , 2018, ICLR.

[198]  Ankur Taly,et al.  Axiomatic Attribution for Deep Networks , 2017, ICML.

[199]  John Rushby,et al.  The Interpretation and Evaluation of Assurance Cases , 2015 .

[200]  Junfeng Yang,et al.  Formal Security Analysis of Neural Networks using Symbolic Intervals , 2018, USENIX Security Symposium.

[201]  Antonio Criminisi,et al.  Measuring Neural Net Robustness with Constraints , 2016, NIPS.

[202]  David Wagner,et al.  Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods , 2017, AISec@CCS.