Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures

This paper addresses the problem of evaluating learning systems in safety-critical domains such as autonomous driving, where failures can have catastrophic consequences. We focus on two problems: searching for scenarios in which learned agents fail, and estimating their probability of failure. The standard method for agent evaluation in reinforcement learning, Vanilla Monte Carlo, can miss failures entirely, leading to the deployment of unsafe agents. We demonstrate that this is an issue for current agents, where even matching the compute used for training is sometimes insufficient for evaluation. To address this shortcoming, we draw upon the rare event probability estimation literature and propose an adversarial evaluation approach. Our approach focuses evaluation on adversarially chosen situations, while still providing unbiased estimates of failure probabilities. The key difficulty is in identifying these adversarial situations: since failures are rare, there is little signal to drive optimization. To solve this we propose a continuation approach that learns failure modes in related but less robust agents. Our approach also allows reuse of data already collected for training the agent. We demonstrate the efficacy of adversarial evaluation on two standard domains: humanoid control and simulated driving. Experimental results show that our methods can find catastrophic failures and estimate failure rates of agents multiple orders of magnitude faster than standard evaluation schemes, in minutes to hours rather than days.
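To make the core idea concrete, the sketch below illustrates how importance sampling can concentrate evaluation on suspected failure regions while remaining an unbiased estimator of the failure probability, and why Vanilla Monte Carlo with the same budget can miss rare failures. This is a minimal toy illustration, not the paper's implementation: the functions `simulate_episode`, `failure_prior_logpdf`, and `adversarial_proposal`, and the choice of a shifted Gaussian proposal, are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_episode(x):
    """Return 1.0 if the agent fails on initial condition x, else 0.0.
    Stand-in for rolling out the agent; failure is a rare region of x."""
    return float(x > 3.5)  # rare event under a standard normal prior

def failure_prior_logpdf(x):
    """Log-density of the nominal distribution over initial conditions (standard normal)."""
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def adversarial_proposal(n):
    """Proposal concentrated on suspected failure regions (here, a shifted normal)."""
    x = rng.normal(loc=3.5, scale=1.0, size=n)
    logq = -0.5 * (x - 3.5) ** 2 - 0.5 * np.log(2 * np.pi)
    return x, logq

def vanilla_mc(n):
    """Sample from the nominal distribution; likely sees zero or a handful of failures."""
    x = rng.normal(size=n)
    return np.mean([simulate_episode(xi) for xi in x])

def adversarial_is(n):
    """Sample from the adversarial proposal; importance weights keep the estimate unbiased."""
    x, logq = adversarial_proposal(n)
    logw = failure_prior_logpdf(x) - logq
    f = np.array([simulate_episode(xi) for xi in x])
    return np.mean(np.exp(logw) * f)

print("Vanilla Monte Carlo estimate: ", vanilla_mc(10_000))
print("Importance sampling estimate: ", adversarial_is(10_000))
print("True failure probability ~ 2.3e-4")  # P(X > 3.5) for a standard normal
```

With the same number of episodes, the importance-sampling estimate has far lower variance because most proposal samples land in the failure region; the paper's contribution is learning such a proposal (a failure probability predictor) when failures are too rare to provide a direct optimization signal.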
