Large-scale traffic grid signal control with regional reinforcement learning

Reinforcement learning (RL) based traffic signal control for large-scale traffic grids is challenging due to the curse of dimensionality. In particular, searching for an optimal policy in a huge action space is impractical, even with approximate Q-functions. On the other hand, heuristic self-organizing algorithms can achieve efficient decentralized control, but most of them make little effort to optimize real-time traffic. This paper proposes a new regional RL algorithm that adaptively forms local cooperation regions and then learns the optimal control policy for each region separately. Specifically, we maintain a set of learning parameters to capture the control patterns in regions at different scales. At each time step, we first decompose the large-scale traffic grid into disjoint sub-regions, depending on the real-time traffic conditions. Next, we apply approximate Q-learning to learn the centralized control policy within each sub-region, updating the corresponding learning parameters based on traffic observations. Numerical experiments demonstrate that our regional RL algorithm is computationally efficient and functionally adaptive, and that it outperforms typical heuristic decentralized algorithms.
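
The per-region learning step described above can be illustrated with a minimal sketch of approximate Q-learning using a linear Q-function. This is not the paper's implementation: the linear form Q(s, a) = theta · phi(s, a), the feature vectors, the step size `alpha`, the discount `gamma`, and the choice to key one parameter vector per region size in `params_by_scale` are all assumptions made here for illustration only.

```python
import numpy as np


def q_learning_update(theta, phi_sa, reward, next_phis, alpha=0.1, gamma=0.9):
    """One approximate Q-learning step for a linear Q(s, a) = theta . phi(s, a).

    theta     : parameter vector for one region scale (assumed representation)
    phi_sa    : feature vector of the state-action pair just taken
    next_phis : feature vectors of every action available in the next state
    """
    # Greedy value of the next state; 0.0 if no action is available.
    best_next = max((float(theta @ p) for p in next_phis), default=0.0)
    # Temporal-difference error between the target and the current estimate.
    td_error = reward + gamma * best_next - float(theta @ phi_sa)
    # Gradient step: for a linear Q, the gradient w.r.t. theta is phi_sa.
    return theta + alpha * td_error * phi_sa


# Hypothetical store: one parameter vector per region scale, keyed here by
# the number of intersections in the region (an assumption, not from the paper).
params_by_scale = {1: np.zeros(2), 2: np.zeros(2)}

# Update the parameters of a single-intersection region from one observation.
phi = np.array([1.0, 0.0])
next_options = [np.array([0.0, 1.0])]
params_by_scale[1] = q_learning_update(params_by_scale[1], phi, 1.0, next_options)
```

In this sketch, each sub-region produced by the decomposition step would look up the parameter vector matching its scale and apply the same update, so regions of the same size share what they learn.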
