Intelligent Traffic Signal Control with Deep Reinforcement Learning at Single Intersection

In this paper, we apply the Proximal Policy Optimization (PPO) algorithm to intelligent traffic signal control at a single intersection with eight lanes and four signal phases. The optimization goal is to minimize the average waiting time of vehicles and thereby improve the traffic efficiency of the intersection. Extensive experiments are conducted in Simulation of Urban MObility (SUMO) to evaluate the performance of the proposed algorithm and to compare it with other classic algorithms, including Deep Q-Network (DQN), Advantage Actor-Critic (A2C), and Fixed Time. Simulation results show that the proposed PPO algorithm outperforms the others under various traffic scenarios, to varying degrees. The performance gain is significant under unbalanced traffic, where one direction is saturated while the other is not, and becomes marginal when all directions are either saturated or unsaturated. PPO also demonstrates good portability and robustness over time-varying traffic patterns, which implies that it could be a preferable option for implementation in real-world intelligent traffic signal control systems.
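
As a rough illustration of the two ingredients named above, the following minimal sketch shows the clipped surrogate objective from the original PPO formulation (Schulman et al., 2017) alongside a reward based on average waiting time. The abstract does not specify the reward function, the clipping parameter, or any implementation details, so the function names, the default `clip_eps=0.2`, and the choice of negative mean waiting time as the reward are all assumptions made for illustration only.

```python
from typing import Sequence


def waiting_time_reward(waiting_times: Sequence[float]) -> float:
    """Hypothetical reward: negative mean waiting time (in seconds).

    Assumption: the paper only states that average waiting time is
    minimized; the exact reward signal is not given in the abstract.
    """
    if not waiting_times:
        return 0.0  # no vehicles present, so nothing to penalize
    return -sum(waiting_times) / len(waiting_times)


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value, not taken from the paper
) -> float:
    """Mean PPO clipped surrogate objective over a batch of samples.

    ratios[i] is pi_new(a_i | s_i) / pi_old(a_i | s_i), and
    advantages[i] is the advantage estimate for the same sample.
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")
    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(ratios)
```

In this sketch, the action at each decision step would be the choice of one of the four signal phases, and the objective would be maximized with respect to the policy parameters; both the state encoding and the network architecture are left out because the abstract does not describe them.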