Paper information - Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?

Out-of-training-distribution (OOD) scenarios are a common challenge for learning agents at deployment, typically leading to arbitrary deductions and poorly informed decisions. In principle, detecting and adapting to OOD scenes can mitigate their adverse effects. In this paper, we highlight the limitations of current approaches to novel driving scenes and propose an epistemic uncertainty-aware planning method, called robust imitative planning (RIP). Our method can detect and recover from some distribution shifts, reducing overconfident and catastrophic extrapolations in OOD scenes. If the model's uncertainty is too great to suggest a safe course of action, the model can instead query the expert driver for feedback, enabling sample-efficient online adaptation. We term this variant of our method adaptive robust imitative planning (AdaRIP). Our methods outperform current state-of-the-art approaches in the nuScenes prediction challenge. However, since no benchmark currently exists for evaluating OOD detection and adaptation in control, we introduce an autonomous car novel-scene benchmark, CARNOVEL, to evaluate the robustness of driving agents on a suite of tasks with distribution shifts.
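The decision rule described in the abstract (plan under epistemic uncertainty, and fall back to the expert when uncertainty is too high) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, beyond what the abstract states, that epistemic uncertainty is approximated by disagreement among an ensemble of plan-scoring functions, that plans are ranked by their worst-case score across the ensemble, and that the expert is queried when the score variance of the chosen plan exceeds a threshold. The names `robust_plan`, `Scorer`, and `variance_threshold` are hypothetical.

```python
from statistics import pvariance
from typing import Callable, Sequence, Tuple

Plan = Tuple[float, ...]
# Hypothetical scorer type: higher means the plan looks more expert-like.
Scorer = Callable[[Plan], float]


def robust_plan(
    plans: Sequence[Plan],
    ensemble: Sequence[Scorer],
    variance_threshold: float,
) -> Tuple[Plan, bool]:
    """Return (chosen plan, whether to query the expert).

    Assumption: each plan is ranked by its worst-case score over the
    ensemble, and ensemble disagreement (variance of the scores) on the
    chosen plan stands in for epistemic uncertainty.
    """
    if not plans or not ensemble:
        raise ValueError("need at least one plan and one ensemble member")

    best_plan = plans[0]
    best_worst_case = float("-inf")
    best_variance = 0.0
    for plan in plans:
        scores = [score(plan) for score in ensemble]
        worst_case = min(scores)  # pessimistic aggregation over the ensemble
        if worst_case > best_worst_case:
            best_plan = plan
            best_worst_case = worst_case
            best_variance = pvariance(scores)

    # Too much disagreement: defer to the expert instead of acting alone.
    query_expert = best_variance > variance_threshold
    return best_plan, query_expert


if __name__ == "__main__":
    # Two toy scorers that prefer plans near 0.0 and 0.2 respectively.
    ensemble = [lambda p: -abs(p[0] - 0.0), lambda p: -abs(p[0] - 0.2)]
    candidates = [(0.0,), (1.0,)]
    print(robust_plan(candidates, ensemble, variance_threshold=0.05))
```

In an adaptive setting, a `True` flag would trigger a request for an expert demonstration that is then used to update the ensemble. How that update is performed is not specified in the abstract and is left out of the sketch.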
