Evaluating reinforcement learning state representations for adaptive traffic signal control

Abstract: Reinforcement learning has shown potential for developing effective adaptive traffic signal controllers that reduce traffic congestion and improve mobility. Despite many successful research studies, few of these ideas have been implemented in practice, and there remains uncertainty about the data and sensor requirements needed to actualize reinforcement learning traffic signal control. We seek to understand these data requirements and the performance differences among state representations for reinforcement learning traffic signal control. We model three state representations, from low to high resolution, and compare their performance in simulation using the asynchronous advantage actor-critic algorithm with neural network function approximation. Results show that low-resolution state representations (e.g., occupancy and average speed) perform almost identically to high-resolution state representations (e.g., individual vehicle position and speed). These results indicate that implementing reinforcement learning traffic signal controllers may be possible with conventional sensors, such as loop detectors, and may not require sophisticated sensors, such as cameras or radar.
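
The contrast between the low- and high-resolution state representations can be made concrete with a small sketch. The code below is an illustration, not the authors' exact encoding: it builds a low-resolution state (per-lane occupancy and average speed, as a loop detector might report) and a high-resolution state (a per-cell grid of vehicle presence and speed, as a camera or radar might provide). The Vehicle fields, lane length, and cell size are illustrative assumptions.

```python
# Sketch of two state encodings for a single approach lane (assumed values).
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class Vehicle:
    position: float   # distance from the stop line in metres (hypothetical field)
    speed: float      # current speed in m/s
    length: float = 5.0

LANE_LENGTH = 150.0   # metres of approach covered by sensing (assumption)
CELL_LENGTH = 7.5     # metres per grid cell in the high-resolution state (assumption)

def low_resolution_state(lane: List[Vehicle]) -> np.ndarray:
    """Loop-detector-style features: occupancy and mean speed of the lane."""
    occupancy = sum(v.length for v in lane) / LANE_LENGTH
    avg_speed = float(np.mean([v.speed for v in lane])) if lane else 0.0
    return np.array([occupancy, avg_speed], dtype=np.float32)

def high_resolution_state(lane: List[Vehicle]) -> np.ndarray:
    """Camera/radar-style features: per-cell vehicle presence and speed."""
    n_cells = int(LANE_LENGTH // CELL_LENGTH)
    presence = np.zeros(n_cells, dtype=np.float32)
    speeds = np.zeros(n_cells, dtype=np.float32)
    for v in lane:
        cell = min(int(v.position // CELL_LENGTH), n_cells - 1)
        presence[cell] = 1.0
        speeds[cell] = v.speed
    return np.concatenate([presence, speeds])

# Example: two queued vehicles near the stop line and one approaching at speed.
lane = [Vehicle(2.0, 0.0), Vehicle(9.0, 0.5), Vehicle(80.0, 13.0)]
print(low_resolution_state(lane))    # [0.1, 4.5] -> 2 values per lane
print(high_resolution_state(lane))   # presence + speed grid -> 2 * n_cells values per lane
```

Either vector (concatenated over all approach lanes, typically together with the current signal phase) could serve as the input to the actor-critic network's function approximator; the low-resolution encoding simply requires far less sensing infrastructure to produce.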
