Paper information - Large-scale traffic grid signal control with regional Reinforcement Learning

Large-scale traffic grid signal control with regional Reinforcement Learning

Reinforcement learning (RL) based traffic signal control for large-scale traffic grids is challenging due to the curse of dimensionality. In particular, searching for an optimal policy in a huge action space is impractical, even with approximate Q-functions. On the other hand, heuristic self-organizing algorithms can achieve efficient decentralized control, but most of them make little effort to optimize real-time traffic. This paper proposes a new regional RL algorithm that adaptively forms local cooperation regions and then learns the optimal control policy for each region separately. In particular, we maintain a set of learning parameters to capture the control patterns in regions at different scales. At each time step, we first decompose the large-scale traffic grid into disjoint sub-regions, depending on the real-time traffic conditions. Next, we apply approximate Q-learning to learn the centralized control policy within each sub-region, updating the corresponding learning parameters based on traffic observations. Numerical experiments demonstrate that our regional RL algorithm is computationally efficient and functionally adaptive, and that it outperforms typical heuristic decentralized algorithms.
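The two steps described in the abstract (decomposing the grid into disjoint sub-regions, then running approximate Q-learning inside each one) can be illustrated with a minimal Python sketch. The abstract does not specify the decomposition rule, the features, or the reward, so everything below is an assumption made for illustration: `decompose` uses a hypothetical queue-threshold rule as a stand-in for the paper's actual method, the Q-function is assumed to be linear in a feature vector, and the names `decompose`, `q_value`, `q_update`, `alpha`, and `gamma` are not taken from the paper.

```python
import numpy as np


def decompose(n_nodes, links, queue, threshold):
    """Split intersections 0..n_nodes-1 into disjoint regions.

    Hypothetical rule (an assumption, not the paper's method): two adjacent
    intersections join the same region when the queue on the link between
    them exceeds `threshold`.
    """
    parent = list(range(n_nodes))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for u, v in links:
        if queue[(u, v)] > threshold:
            parent[find(u)] = find(v)

    regions = {}
    for i in range(n_nodes):
        regions.setdefault(find(i), []).append(i)
    return sorted(regions.values())


def q_value(theta, phi):
    """Linear approximate Q-value: Q(s, a) = theta . phi(s, a)."""
    return float(np.dot(theta, phi))


def q_update(theta, phi_sa, reward, next_phis, alpha=0.1, gamma=0.9):
    """One approximate Q-learning step for a linear Q-function.

    phi_sa:    features of the (state, action) pair just taken.
    next_phis: features of every action available in the next state.
    """
    best_next = max(q_value(theta, phi) for phi in next_phis)
    td_error = reward + gamma * best_next - q_value(theta, phi_sa)
    return theta + alpha * td_error * phi_sa


# Usage: a 4-intersection corridor with heavy queues on links (0,1) and (2,3).
links = [(0, 1), (1, 2), (2, 3)]
queue = {(0, 1): 5, (1, 2): 1, (2, 3): 7}
regions = decompose(4, links, queue, threshold=3)

# One parameter vector per region size, mirroring the idea of keeping
# separate learning parameters for regions at different scales.
theta_by_size = {len(r): np.zeros(2) for r in regions}
theta_by_size[2] = q_update(
    theta_by_size[2],
    phi_sa=np.array([1.0, 0.0]),
    reward=1.0,
    next_phis=[np.array([0.0, 1.0])],
    alpha=0.5,
)
```

Keying the parameters by region size is only one possible reading of "regions at different scales"; it keeps the number of parameter sets small even when the decomposition changes at every time step.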
