Multimodal Safety-Critical Scenarios Generation for Decision-Making Algorithms Evaluation

Existing neural-network-based autonomous systems have been shown to be vulnerable to adversarial attacks, so a thorough evaluation of their robustness is essential. However, evaluating robustness only under worst-case scenarios derived from known attacks is not comprehensive, and some of those attacks rarely occur in the real world. Moreover, the distribution of safety-critical data is usually multimodal, while most traditional attacks and evaluation methods focus on a single modality. To address these challenges, we propose a flow-based multimodal safety-critical scenario generator for evaluating decision-making algorithms. The generative model is optimized with weighted likelihood maximization, and a gradient-based sampling procedure is integrated to improve sampling efficiency. Safety-critical scenarios are generated by efficiently querying the task algorithms and a simulator. Experiments on a self-driving task demonstrate the advantages of our method in terms of testing efficiency and multimodal modeling capability. We evaluate six reinforcement-learning algorithms on our generated traffic scenarios and draw empirical conclusions about their robustness.
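To illustrate the weighted likelihood maximization objective mentioned above, the following is a minimal, hypothetical sketch (not the paper's actual model): a one-dimensional affine "flow" x = mu + sigma * z with a standard-normal base distribution, trained by gradient descent on a weighted negative log-likelihood, where the weights `w` up-weight "risky" tail samples. The data, weights, and learning rate are illustrative assumptions.

```python
import numpy as np

# Toy data: observed scenarios, with heavier weights on the "risky" tail,
# mimicking a weighted likelihood objective that emphasizes safety-critical samples.
rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.0, size=500)
w = np.where(x > 2.5, 5.0, 1.0)   # assumed risk weighting (illustrative)
w = w / w.sum()                    # normalize weights

# Affine flow parameters: x = mu + exp(log_sigma) * z, z ~ N(0, 1)
mu, log_sigma = 0.0, 0.0
lr = 0.1
for _ in range(5000):
    sigma = np.exp(log_sigma)
    z = (x - mu) / sigma
    # Gradients of the weighted NLL: sum_i w_i * (0.5*z_i^2 + log sigma)
    grad_mu = -(w * z / sigma).sum()
    grad_log_sigma = (w * (1.0 - z ** 2)).sum()
    mu -= lr * grad_mu
    log_sigma -= lr * grad_log_sigma

# The optimum matches the closed-form weighted mean and std of the data.
mu_star = (w * x).sum()
sigma_star = np.sqrt((w * (x - mu_star) ** 2).sum())
```

Because the weights concentrate mass on the tail, the fitted distribution shifts toward the safety-critical region; a deep flow (e.g., RealNVP-style couplings) replaces the single affine map in practice, but the weighted objective has the same form.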
