Deep Reinforcement Learning for Traffic Signal Control: A Review

Traffic congestion is a complex, vexing problem that grows by the day in most urban areas worldwide. Combining deep learning with traditional reinforcement learning has produced deep reinforcement learning (DRL), an approach that has shown promising results on high-dimensional, complex problems, including traffic congestion. This article reviews the attributes of traffic signal control (TSC), as well as the DRL architectures and methods applied to TSC, to show how DRL has been used to address traffic congestion and improve performance. The review also covers simulation platforms, a complexity analysis, and guidelines and design considerations for applying DRL to TSC. Finally, this article presents open issues and new research areas with the aim of sparking new interest in this research field. To the best of our knowledge, this is the first review article that focuses on the application of DRL to TSC.
