Time Difference Penalized Traffic Signal Timing by LSTM Q-Network to Balance Safety and Capacity at Intersections

The conflict between limited road resources and rapidly growing car ownership makes traffic signal timing a pivotal challenge. Emerging studies have addressed adaptive signal timing, but most still focus on intersection throughput, leaving safety and travel experience unconsidered. This paper proposes a time difference penalized traffic signal timing method based on reinforcement learning to balance safety and throughput capacity in the traffic control system. First, a microscopic state representation is proposed that integrates the dynamics of both traffic lights and road vehicles, including lane-changing and car-following driver behaviors as well as the previous traffic light phase and its duration. Second, an action space of eight signal phases and a behavior-aware reward function are designed to resist red-light overflow. Finally, a partial long short-term memory (LSTM) network is trained to balance traffic efficiency and travel experience. During training, a parallel sampling method gathers experience from multiple environments to accelerate convergence in practical applications. Experimental results show that the proposed method improves intersection efficiency by up to 14.28% compared with fixed signal timing and 5.26% compared with DQN, while eliminating red-light overflow time.
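The abstract does not spell out the network architecture or the penalized reward, so the following is a minimal sketch, assuming PyTorch, of how an LSTM Q-network over an 8-phase action space and a behavior-aware, penalty-based reward might be wired up. The layer sizes, feature layout, and penalty weights (alpha, beta) are illustrative assumptions, not the authors' reported configuration.

```python
# Hypothetical sketch of an LSTM Q-network over an 8-phase action space.
# Layer sizes, state features and penalty weights are assumptions for
# illustration, not the paper's reported settings.
import torch
import torch.nn as nn

N_PHASES = 8  # action space: 8 candidate signal phases


class LSTMQNetwork(nn.Module):
    """Maps a short history of intersection states to Q-values per phase."""

    def __init__(self, state_dim: int, hidden_dim: int = 128):
        super().__init__()
        # The recurrent layer summarizes the recent sequence of microscopic
        # states (e.g. vehicle positions/speeds, current phase, phase duration).
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, N_PHASES),
        )

    def forward(self, state_seq: torch.Tensor) -> torch.Tensor:
        # state_seq: (batch, time, state_dim); return Q-values at the last step.
        out, _ = self.lstm(state_seq)
        return self.head(out[:, -1, :])


def penalized_reward(throughput: float, waiting_time_delta: float,
                     overflow_time: float, alpha: float = 0.1,
                     beta: float = 1.0) -> float:
    """Illustrative behavior-aware reward: credit served vehicles while
    penalizing growth in accumulated waiting time and red-light overflow."""
    return throughput - alpha * waiting_time_delta - beta * overflow_time
```

In a training loop of this shape, the greedy action at each decision step would be the phase with the largest Q-value, e.g. `phase = q_net(state_seq).argmax(dim=-1)`, with experience collected in parallel from several simulated intersections to speed up convergence.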
