Open Questions in Testing of Learned Computer Vision Functions for Automated Driving

Vision is an important sensing modality in automated driving. Deep learning-based approaches have gained popularity for computer vision (CV) tasks such as semantic segmentation and object detection. However, the black-box nature of deep neural networks (DNNs) is a challenge for practical software verification. With this paper, we want to initiate a discussion in the academic community about research questions concerning software testing of DNNs for safety-critical CV tasks. To this end, we provide an overview of related work from various domains, including software testing, machine learning, and computer vision, and derive a set of open research questions to start a discussion between the fields.
