Safe, Efficient, and Comfortable Velocity Control based on Reinforcement Learning for Autonomous Driving

Abstract A model for velocity control during car following is proposed based on reinforcement learning (RL). To optimize driving performance, a reward function is developed by referencing human driving data and combining driving features related to safety, efficiency, and comfort. With this reward function, the RL agent learns to control vehicle speed so as to maximize cumulative reward, through trial and error in a simulation environment. To avoid potentially unsafe actions, the proposed RL model incorporates a collision avoidance strategy for safety checks. The safety check is applied during both the training and testing phases, which results in faster convergence and zero collisions. A total of 1,341 car-following events extracted from the Next Generation Simulation (NGSIM) dataset are used to train and test the proposed model. Model performance is evaluated by comparison with empirical NGSIM data and with an adaptive cruise control (ACC) algorithm implemented through model predictive control (MPC). The experimental results show that the proposed model achieves safe, efficient, and comfortable velocity control and outperforms human drivers in that it 1) has larger time-to-collision (TTC) values than human drivers, 2) maintains efficient and safe headways of around 1.2 s, and 3) follows the lead vehicle comfortably with smooth acceleration (its jerk is only a third of that of human drivers). Compared with the MPC-based ACC algorithm, the proposed model performs better in terms of safety and comfort, and especially in running speed during testing (more than 200 times faster). These results indicate that the proposed approach could contribute to the development of better autonomous driving systems. Source code for this paper can be found at https://github.com/MeixinZhu/Velocity_control .
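To make the abstract's two key ideas concrete, the sketch below illustrates (a) a reward that combines safety (TTC), efficiency (headway), and comfort (jerk) terms, and (b) a kinematic safety check that can override the RL action. The feature choices come from the abstract; the functional forms, thresholds, and weights here are hypothetical, not the paper's actual formulation.

```python
import math

def reward(ttc, headway, jerk,
           ttc_threshold=4.0, headway_target=1.2, jerk_scale=3.0):
    """Illustrative reward combining safety, efficiency, and comfort.

    The features (TTC, headway, jerk) follow the abstract; the
    functional forms and constants are assumed for illustration.
    """
    # Safety: penalize small time-to-collision values (negative when
    # TTC is below the threshold, zero otherwise).
    r_safe = math.log(ttc / ttc_threshold) if 0 < ttc < ttc_threshold else 0.0
    # Efficiency: reward headways close to the ~1.2 s target.
    r_eff = -abs(headway - headway_target)
    # Comfort: penalize large jerk (rate of change of acceleration).
    r_comf = -(jerk / jerk_scale) ** 2
    return r_safe + r_eff + r_comf

def safe_action(action, spacing, v_ego, v_lead, a_max=3.0, dt=0.1):
    """Hypothetical safety check: override the RL acceleration command
    with maximum braking when the one-step predicted spacing would fall
    below a minimum allowed gap."""
    min_gap = 2.0  # assumed minimum spacing in meters
    predicted_gap = spacing + (v_lead - v_ego - action * dt) * dt
    return -a_max if predicted_gap < min_gap else action
```

In this sketch the check is applied at every simulation step, both during training (shaping the experience the agent collects) and at test time, which mirrors how the abstract describes using the safety check in both phases.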
