Efficient Lane-changing Behavior Planning via Reinforcement Learning with Imitation Learning Initialization

Robust lane-changing behavior planning is critical to the safety and comfort of autonomous vehicles. In this paper, we propose an efficient and robust lane-changing decision-making method based on reinforcement learning (RL) with imitation learning (IL) initialization. The method learns the underlying lane-changing driving mechanisms from the interactions between the vehicle and its environment, which simplifies manual driving modeling and adapts well to dynamic changes in the lane-changing scene. Our method makes the following improvements on the Proximal Policy Optimization (PPO) algorithm: (1) a dynamic hybrid reward mechanism for lane-changing tasks is adopted; (2) a state space construction method based on fuzzy logic and deformation pose is presented to enable behavior planning to learn more refined tactical decision-making; (3) an RL initialization method based on imitation learning, which requires only a small amount of scene data, is introduced to address the low learning efficiency of RL under sparse rewards. Experiments in the SUMO simulator show the effectiveness of the proposed method, and tests in the CARLA simulator verify its generalization ability.
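
The general idea of initializing a policy by imitation and then refining it with PPO can be illustrated with a minimal sketch. This is not the paper's implementation: the linear softmax policy, the two-action set (keep lane, change left), the three state features (gap ahead, gap in the left lane, bias), the demonstration values, and the function names are all assumptions made for illustration. The imitation step shown is plain behavior cloning, and the PPO part shows only the standard clipped surrogate term for a single sample.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_probs(weights, state):
    """Action probabilities of a linear softmax policy (hypothetical model)."""
    logits = [sum(w * x for w, x in zip(row, state)) for row in weights]
    return softmax(logits)

def behavior_cloning_init(demos, n_features, n_actions, lr=0.5, epochs=200):
    """Fit the policy to (state, action) demonstrations by minimizing
    cross-entropy with stochastic gradient descent."""
    weights = [[0.0] * n_features for _ in range(n_actions)]
    for _ in range(epochs):
        for state, action in demos:
            probs = policy_probs(weights, state)
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a: p_a - 1[a == action]
                grad = probs[a] - (1.0 if a == action else 0.0)
                for j in range(n_features):
                    weights[a][j] -= lr * grad * state[j]
    return weights

def ppo_clipped_surrogate(new_prob, old_prob, advantage, eps=0.2):
    """Standard PPO clipped surrogate term for one sample (to be maximized)."""
    ratio = new_prob / old_prob
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# Hypothetical demonstrations: [gap_ahead, gap_left, bias] -> action
# (0 = keep lane, 1 = change left). Values are made up for illustration.
DEMOS = [
    ([1.0, 0.2, 1.0], 0),
    ([0.1, 0.9, 1.0], 1),
    ([0.9, 0.1, 1.0], 0),
    ([0.2, 0.8, 1.0], 1),
]

# Imitation learning provides the starting weights for RL fine-tuning.
init_weights = behavior_cloning_init(DEMOS, n_features=3, n_actions=2)

if __name__ == "__main__":
    for state, action in DEMOS:
        probs = policy_probs(init_weights, state)
        print(state, "demo action:", action, "policy:", [round(p, 3) for p in probs])
```

In this sketch, the weights returned by `behavior_cloning_init` would serve as the starting point for PPO updates, so the agent begins from a policy that already reproduces the demonstrated decisions instead of exploring from scratch under sparse rewards.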
