Stabilization Approaches for Reinforcement Learning-Based End-to-End Autonomous Driving

Deep reinforcement learning (DRL) has been applied successfully to end-to-end autonomous driving, especially in simulation environments. However, common DRL approaches are often unstable or fail to converge in complex driving scenarios. This paper proposes two approaches that improve the stability of policy-model training while requiring as little manually collected data as possible. In the first approach, reinforcement learning is combined with imitation learning: a small amount of demonstration data is used to train a feature network whose weights initialize the policy's parameters. In the second approach, an auxiliary network is added to the reinforcement learning framework; it exploits real-time measurement information to deepen the agent's understanding of the environment, without any demonstrator guidance. To verify the effectiveness of the two approaches, simulations are conducted in both image-based and lidar-based end-to-end autonomous driving systems. The approaches are tested not only in a virtual game world but also in Gazebo, where we build a 3D world from the real vehicle model of the Ranger XP900 platform, real 3D obstacle models, and real motion constraints with inertial characteristics, so that the trained end-to-end driving model transfers better to the real world. Experimental results show that performance improves by over 45% in the virtual game world, and that training converges quickly and stably in Gazebo, where previous methods could hardly converge.
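The first approach can be sketched minimally as follows: a parameter is pretrained by imitation on a few expert (observation, action) pairs and then handed to the reinforcement-learning phase as its initialization. This is a purely illustrative toy in which a single linear weight stands in for the feature network; the data, learning rate, and model below are assumptions for illustration, not the paper's actual setup.

```python
# Approach 1 (illustrative sketch): pretrain a feature weight on a handful
# of expert (observation, action) pairs, then hand it to the RL phase as an
# initialization. A single linear weight stands in for the feature network.

expert_data = [(0.5, 0.25), (1.0, 0.5), (-0.4, -0.2)]  # expert: action = 0.5 * obs

def pretrain(data, lr=0.1, epochs=200):
    """Fit w so that w * obs approximates the expert action (imitation loss)."""
    w = 0.0
    for _ in range(epochs):
        for obs, act in data:
            w -= lr * (w * obs - act) * obs  # gradient step on squared error
    return w

w0 = pretrain(expert_data)  # imitation-learned initialization for the policy
print(round(w0, 3))         # prints 0.5 -- the expert's underlying gain
# The RL phase would then fine-tune from w0 rather than from a random
# initialization, which is what stabilizes early training.
```

The point of the sketch is the training schedule, not the model: the supervised imitation phase places the policy in a reasonable region of parameter space before the higher-variance policy-gradient updates begin.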
