Encoding Integrated Decision and Control for Autonomous Driving with Mixed Traffic Flow

Reinforcement learning (RL) has been widely adopted for learning intelligent driving policies in autonomous driving owing to its self-evolution ability and human-like learning paradigm. Despite many elegant demonstrations of RL-enabled decision-making, current research mainly focuses on vehicle-only driving environments and ignores other traffic participants such as bicycles and pedestrians. On urban roads, the interactions within mixed traffic flows are highly dynamic and complex, which makes it difficult to learn a policy that is both safe and intelligent. This paper proposes encoding integrated decision and control (E-IDC) to handle complicated driving tasks with mixed traffic flows. E-IDC consists of an encoding function that constructs the driving state, a value function that selects the optimal path, and a policy function that outputs the control commands of the ego vehicle. In particular, the encoding function can handle different types and a varying number of traffic participants and extracts features from the raw driving observations. We then design the training scheme for the functions of E-IDC with RL algorithms, adding gradient-based update rules and refining the safety constraints to account for the differences among participant types. Verification is conducted on an intersection scenario with mixed traffic flows, and the results show that E-IDC improves driving performance, including tracking performance and satisfaction of safety constraints, by a large margin. The online application indicates that E-IDC realizes efficient and smooth driving at a complex intersection, guaranteeing intelligence and safety simultaneously.
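The abstract describes E-IDC as three learned functions: an encoding function that maps a varying number of heterogeneous traffic participants to a fixed-dimensional driving state, a value function that scores candidate paths, and a policy function that outputs the ego vehicle's control commands. The sketch below illustrates one plausible realization of that structure in PyTorch. It is a minimal sketch under our own assumptions: the module names, feature dimensions, participant-type embedding, and sum-pooling choice are illustrative, not the paper's actual implementation.

    # Minimal sketch of the E-IDC function structure described in the abstract.
    # All names, dimensions, and the sum-pooling choice are assumptions.
    import torch
    import torch.nn as nn

    class ParticipantEncoder(nn.Module):
        """Encodes a variable number of heterogeneous traffic participants
        into a fixed-dimensional, permutation-invariant feature vector."""
        def __init__(self, obs_dim=4, num_types=3, embed_dim=32, feat_dim=64):
            super().__init__()
            # One embedding per participant type, e.g. vehicle/bicycle/pedestrian.
            self.type_embed = nn.Embedding(num_types, embed_dim)
            # Shared per-participant MLP applied before pooling.
            self.phi = nn.Sequential(
                nn.Linear(obs_dim + embed_dim, feat_dim), nn.GELU(),
                nn.Linear(feat_dim, feat_dim),
            )

        def forward(self, obs, types, mask):
            # obs:   (B, N, obs_dim) raw states, e.g. x, y, speed, heading
            # types: (B, N) integer participant type; mask: (B, N) valid flags
            h = self.phi(torch.cat([obs, self.type_embed(types)], dim=-1))
            h = h * mask.unsqueeze(-1)   # zero out padded participant slots
            return h.sum(dim=1)          # sum pooling -> permutation invariant

    class PolicyValueHeads(nn.Module):
        """Policy head outputs control commands; value head scores a candidate path."""
        def __init__(self, feat_dim=64, ego_dim=6, path_dim=8, act_dim=2):
            super().__init__()
            self.policy = nn.Sequential(
                nn.Linear(feat_dim + ego_dim + path_dim, 128), nn.GELU(),
                nn.Linear(128, act_dim), nn.Tanh(),  # e.g. steering + acceleration
            )
            self.value = nn.Sequential(
                nn.Linear(feat_dim + ego_dim + path_dim, 128), nn.GELU(),
                nn.Linear(128, 1),
            )

        def forward(self, feat, ego, path):
            x = torch.cat([feat, ego, path], dim=-1)
            return self.policy(x), self.value(x)

Masked sum pooling over per-participant features makes the encoded state indifferent to both the ordering and the number of surrounding participants, which matches the property the abstract attributes to the encoding function; the type embedding is one way to let a single encoder handle vehicles, bicycles, and pedestrians with distinct behavior priors.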
