Improved Deep Reinforcement Learning with Expert Demonstrations for Urban Autonomous Driving

Urban autonomous driving remains challenging because of the complexity of the driving environment. Learning-based approaches, such as reinforcement learning (RL) and imitation learning (IL), have demonstrated advantages over rule-based approaches and show great potential for intelligent decision making, but they still perform poorly in urban driving situations. To tackle this problem, this paper proposes a novel learning-based method that combines deep reinforcement learning with expert demonstrations, focusing on longitudinal motion control in autonomous driving. The proposed method adopts the soft actor-critic (SAC) structure and modifies the learning process of the policy network to pursue two goals jointly: maximizing reward and imitating the expert. Moreover, an adaptive prioritized experience replay is designed to sample experience from both the agent's self-exploration and the expert demonstrations, improving sample efficiency. The proposed method is validated in a simulated urban roundabout scenario and compared with several prevailing RL and IL baselines. The results show that the proposed method trains faster and navigates more safely and time-efficiently than the baselines. The ablation study reveals that the prioritized replay and the expert demonstration filter play important roles in the proposed method.
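The sketch below illustrates the general idea described in the abstract, not the authors' implementation: a SAC-style actor update whose loss combines the maximum-entropy RL objective (computed on the agent's own experience) with a behavior-cloning term (computed on expert demonstrations), plus a simple mixed-batch sampler standing in for the adaptive prioritized replay. Names such as `bc_weight`, `expert_ratio`, `q_net`, `agent_buffer`, and `expert_buffer` are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code) of a SAC actor update
# augmented with an imitation term on expert demonstrations.
import torch
import torch.nn as nn


class GaussianPolicy(nn.Module):
    """Tanh-squashed Gaussian policy over a low-dimensional action
    (e.g. longitudinal acceleration)."""

    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, act_dim)
        self.log_std = nn.Linear(hidden, act_dim)

    def forward(self, obs):
        h = self.net(obs)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-5, 2)
        dist = torch.distributions.Normal(mu, log_std.exp())
        raw = dist.rsample()                       # reparameterized sample
        act = torch.tanh(raw)
        # log-probability with the tanh change-of-variables correction
        logp = dist.log_prob(raw) - torch.log(1 - act.pow(2) + 1e-6)
        return act, logp.sum(-1), mu


def policy_loss(policy, q_net, alpha, agent_obs, expert_obs, expert_act,
                bc_weight=1.0):
    """Actor loss = SAC objective on agent experience + behavior cloning on
    expert data. `q_net` is assumed to map concatenated (obs, act) to a Q-value."""
    # (1) Maximum-entropy RL term: minimize alpha * log pi - Q
    act, logp, _ = policy(agent_obs)
    q_val = q_net(torch.cat([agent_obs, act], dim=-1)).squeeze(-1)
    rl_loss = (alpha * logp - q_val).mean()
    # (2) Imitation term: pull the policy mean toward expert actions
    _, _, mu = policy(expert_obs)
    bc_loss = ((torch.tanh(mu) - expert_act) ** 2).mean()
    return rl_loss + bc_weight * bc_loss


def sample_mixed_batch(agent_buffer, expert_buffer, batch_size, expert_ratio=0.25):
    """Draw a batch mixing self-exploration and expert demonstrations.
    The paper's adaptive prioritized scheme sets these proportions/priorities;
    a fixed ratio is used here as a simplification."""
    n_expert = int(batch_size * expert_ratio)
    return (agent_buffer.sample(batch_size - n_expert),
            expert_buffer.sample(n_expert))
```

In this sketch the relative weight of the imitation term (`bc_weight`) and the share of expert samples per batch (`expert_ratio`) are the knobs that trade off reward maximization against imitation; the paper's adaptive prioritized replay plays the corresponding role by reweighting transitions from the two sources during training.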
