Decision-Making under On-Ramp Merge Scenarios by Distributional Soft Actor-Critic Algorithm

Funding information: This work is supported by the International Science & Technology Cooperation Program of China under 2019YFE0100200, and by the Tsinghua University-Toyota Joint Research Center for AI Technology of Automated Vehicle.

Merging into the highway from the on-ramp is an essential scenario for automated driving. Decision-making in this scenario must balance safety and efficiency to optimize a long-term objective, which is challenging due to its dynamic, stochastic, and adversarial characteristics. Rule-based methods often lead to conservative driving on this task, while learning-based methods have difficulty meeting the safety requirements. In this paper, we propose an RL-based end-to-end decision-making method under a framework of offline training and online correction, called the Shielded Distributional Soft Actor-Critic (SDSAC). The SDSAC adopts policy evaluation with safety consideration in its offline training and a safety shield parameterized with a barrier function in its online correction. These two measures support each other to improve safety without severely damaging efficiency. We verify the SDSAC on an on-ramp merge scenario in simulation. The results show that the SDSAC has the best safety performance among the compared baseline algorithms while still driving efficiently.
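To make the idea of online correction by a barrier-function shield concrete, the following is a minimal sketch and not the SDSAC implementation. The abstract does not specify the barrier function, the vehicle model, or any parameter values, so everything here is an assumption chosen for illustration: a one-dimensional car-following model with a constant-speed lead vehicle, a time-headway barrier h = gap - d_min - tau * v_ego, the discrete-time condition h(s') >= (1 - lam) * h(s), and the hypothetical names `ShieldConfig`, `barrier`, and `shield`.

```python
from dataclasses import dataclass


@dataclass
class ShieldConfig:
    """Illustrative parameters (assumed, not taken from the paper)."""
    dt: float = 0.1      # control step [s]
    d_min: float = 5.0   # minimum standstill gap [m]
    tau: float = 1.5     # desired time headway [s]
    lam: float = 0.2     # barrier decay rate, 0 < lam <= 1
    a_min: float = -4.0  # strongest braking [m/s^2]
    a_max: float = 2.0   # strongest acceleration [m/s^2]


def barrier(gap: float, v_ego: float, cfg: ShieldConfig) -> float:
    """Assumed barrier function: non-negative when the headway is safe."""
    return gap - cfg.d_min - cfg.tau * v_ego


def shield(a_policy: float, gap: float, v_ego: float, v_lead: float,
           cfg: ShieldConfig) -> float:
    """Return the acceleration closest to a_policy (from below) that keeps
    h(s') >= (1 - lam) * h(s) under a one-step constant-lead-speed model.

    If the bound falls below a_min, the result is clipped to a_min and the
    barrier condition may not hold; a real system would need a fallback.
    """
    h = barrier(gap, v_ego, cfg)
    dv = v_lead - v_ego
    # One-step model: h' = h + dv*dt - a*(0.5*dt^2 + tau*dt).
    # Solving h' >= (1 - lam)*h for a gives the upper bound below.
    a_bound = (cfg.lam * h + dv * cfg.dt) / (0.5 * cfg.dt ** 2 + cfg.tau * cfg.dt)
    a_safe = min(a_policy, a_bound)
    return max(cfg.a_min, min(cfg.a_max, a_safe))


if __name__ == "__main__":
    cfg = ShieldConfig()
    # Large gap: the proposed action passes through unchanged.
    print(shield(1.0, gap=50.0, v_ego=20.0, v_lead=20.0, cfg=cfg))
    # Small gap and closing speed: the shield replaces it with braking.
    print(shield(1.0, gap=36.0, v_ego=20.0, v_lead=15.0, cfg=cfg))
```

In this sketch the learned policy is left untouched whenever its action already satisfies the barrier condition, so the shield only costs efficiency near the safety boundary. This mirrors the stated goal of improving safety without severely damaging efficiency, though the paper's actual shield may be parameterized quite differently.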
