A Deep Adaptive Traffic Signal Controller With Long-Term Planning Horizon and Spatial-Temporal State Definition Under Dynamic Traffic Fluctuations

This study proposes a new adaptive traffic signal control scheme for managing dynamically fluctuating traffic flows through intersections. A spatial-temporal representation of the traffic state at an intersection is designed to identify traffic patterns efficiently in complex intersection environments, and a deep neural network (a long short-term memory network, LSTM) determines look-ahead signal control decisions from the estimated long-term feedback for a given traffic state. The actor-critic algorithm, a reinforcement learning method, is adopted to learn the parameters of the LSTM network through repeated interactions between a simulated environment and the adaptive traffic signal controller. To confirm the effectiveness of the proposed scheme, traffic in the numerical experiments was generated from a realistic model environment with 24-hour time-varying demand covering both rush-hour and non-rush-hour conditions. The results show that, compared with an optimized fixed-time plan (Synchro), the proposed scheme can reduce waiting times at intersections by 50%, with the consequential benefits of lower fuel consumption, emissions, queue lengths, and vehicle delays, and higher mean speeds.
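
As a rough illustration of the ideas named in the abstract, the following Python sketch shows one way a spatial-temporal state and the long-term feedback signal used by an actor-critic learner could be represented. It is not the authors' implementation: the class `SpatialTemporalState`, the per-cell vehicle-count layout, the discount factor `gamma`, and the use of negative waiting time as the reward are all assumptions made for this example, and the LSTM itself is omitted.

```python
from collections import deque


class SpatialTemporalState:
    """Rolling window of per-cell vehicle counts (hypothetical state layout).

    Each frame is one spatial snapshot of the intersection approaches;
    keeping the last `history` frames adds the temporal dimension that a
    sequence model such as an LSTM would consume.
    """

    def __init__(self, history, num_cells):
        self.num_cells = num_cells
        # Start with empty-road frames so the sequence length is constant.
        self.frames = deque(
            [[0] * num_cells for _ in range(history)], maxlen=history
        )

    def push(self, cell_counts):
        """Append the newest snapshot; the oldest one is dropped."""
        if len(cell_counts) != self.num_cells:
            raise ValueError("snapshot has the wrong number of cells")
        self.frames.append(list(cell_counts))

    def as_sequence(self):
        """Return the frames oldest-first as a list of lists."""
        return [list(frame) for frame in self.frames]


def discounted_returns(rewards, gamma):
    """Long-term feedback G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def td_advantages(rewards, values, gamma, bootstrap=0.0):
    """One-step TD errors r_t + gamma * V(s_{t+1}) - V(s_t).

    In an actor-critic setup the critic supplies `values`, and these
    errors weight the actor's policy update. `bootstrap` is the value
    assumed for the state after the last step.
    """
    next_values = list(values[1:]) + [bootstrap]
    return [
        r + gamma * nv - v
        for r, v, nv in zip(rewards, values, next_values)
    ]


if __name__ == "__main__":
    state = SpatialTemporalState(history=3, num_cells=2)
    state.push([1, 0])
    state.push([2, 1])
    print(state.as_sequence())

    # Assumed reward: negative total waiting time per decision step.
    rewards = [-4.0, -2.0, -1.0]
    print(discounted_returns(rewards, gamma=0.9))
    print(td_advantages(rewards, values=[-6.0, -3.0, -1.0], gamma=0.9))
```

A discount factor close to 1 makes the return weigh delays far in the future almost as heavily as immediate ones, which is one common way to express a long planning horizon; the value actually used in the study is not stated in the abstract.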
