Learning Driving Decisions by Imitating Drivers' Control Behaviors

Classical autonomous driving systems are modularized as a pipeline of perception, decision, planning, and control. The decision module plays a central role: it processes observations from the perception module and directs the execution of the downstream planning and control modules. Commonly, the decision module is rule-based and therefore difficult to learn from data. Recently, end-to-end neural control policies have been proposed to replace this pipeline because of their generalization ability. However, it remains challenging to enforce physical or logical constraints on the decisions of such policies to ensure driving safety and stability. In this work, we propose a hybrid framework for learning a decision module that is agnostic to the mechanisms of the perception, planning, and control modules. By imitating low-level control behavior, the framework learns high-level driving decisions while bypassing the ambiguous annotation of those decisions. We demonstrate that simulation agents with a learned decision module generalize to various complex driving scenarios where the rule-based approach fails. Furthermore, the learned decision module can generate driving behaviors that are smoother and safer than those of end-to-end neural policies.
