The Utility of Neural Network Test Coverage Measures

In this position paper, we examine what test coverage measures can, and cannot, tell us about neural networks. We begin with a review of the role of test coverage measures in traditional development approaches for safety-related software. We show that coverage measures defined for neural networks cannot achieve the same aims as their counterparts for traditional software. We indicate approaches that can partially meet those aims. We also indicate where current neural network coverage measures remain useful.
