Efficient statistical validation of machine learning systems for autonomous driving

Today's automotive industry is making a bold move to equip vehicles with intelligent driver assistance features. A modern automobile is now equipped with a powerful computing platform that runs multiple machine learning algorithms for environment perception (e.g., pedestrian detection) and motion control (e.g., vehicle stabilization). These machine learning systems must be highly robust, with an extremely small failure rate, to ensure safe and reliable driving. In this paper, we propose a novel Subset Sampling (SUS) algorithm to efficiently validate a machine learning system. In particular, a Markov Chain Monte Carlo algorithm based on graph mapping is developed to accurately estimate the rare failure rate with a minimal amount of test data, thereby minimizing the validation cost. Our numerical experiments show that SUS achieves a 15.2× runtime speed-up over the conventional brute-force Monte Carlo method.
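To illustrate why brute-force Monte Carlo becomes expensive for rare failures, the following minimal sketch computes how many test samples a plain Monte Carlo estimator would need to reach a given relative error, and how many intermediate levels a generic subset-style decomposition would use. It is not the SUS algorithm proposed here: the function names, the per-level conditional probability `p0 = 0.1`, and the example failure rate are all hypothetical choices made for illustration.

```python
import math


def mc_samples_needed(p_fail: float, rel_err: float) -> int:
    """Samples needed by brute-force Monte Carlo.

    The estimator's coefficient of variation is sqrt((1 - p) / (N * p)),
    so N >= (1 - p) / (p * rel_err**2) keeps it at or below rel_err.
    """
    return math.ceil((1.0 - p_fail) / (p_fail * rel_err ** 2))


def subset_levels(p_fail: float, p0: float = 0.1) -> int:
    """Number of nested levels if each conditional probability is about p0.

    Assumes the generic decomposition P(F) = P(F_1) * prod P(F_{k+1} | F_k);
    p0 is an illustrative value, not one taken from the paper.
    The small tolerance guards against floating-point round-off in the ratio.
    """
    return math.ceil(math.log(p_fail) / math.log(p0) - 1e-9)


if __name__ == "__main__":
    p = 1e-6  # hypothetical failure rate
    print("brute-force samples:", mc_samples_needed(p, rel_err=0.1))
    print("subset levels:", subset_levels(p))
```

The sample count grows inversely with the failure rate, while the number of levels grows only logarithmically, which is the intuition behind estimating a rare failure rate as a product of larger conditional probabilities.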
