Reinforcement learning with average cost for adaptive control of traffic lights at intersections

We propose, for the first time, two reinforcement learning algorithms with function approximation for average cost adaptive control of traffic lights. The first is a version of Q-learning with function approximation; the second is a policy gradient actor-critic algorithm that incorporates multi-timescale stochastic approximation. We compare these algorithms, on various network settings, against a range of fixed timing algorithms and against a Q-learning algorithm with full state representation that we also implement. As expected, the full state representation algorithm gives the best results on a two-junction corridor, but it is not implementable on larger road networks. The PG-AC-TLC algorithm that we propose shows the best overall performance.
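To make the first idea concrete, the following is a minimal sketch of one average cost Q-learning step with linear function approximation. It is not the algorithm as specified in the paper: the feature vectors, the step sizes `alpha` and `beta`, the running average cost estimate `rho`, and the function names are all assumptions introduced here for illustration. The sketch only shows the general shape of the update, in which the temporal-difference error subtracts an estimate of the long-run average cost instead of applying a discount factor.

```python
def q_value(theta, features):
    """Linear approximation Q(s, a) ~ theta . phi(s, a)."""
    return sum(t * f for t, f in zip(theta, features))


def average_cost_q_update(theta, rho, phi_sa, cost, next_phis,
                          alpha=0.05, beta=0.01):
    """One average cost Q-learning step (illustrative, hypothetical names).

    theta     -- weight vector (list of floats)
    rho       -- running estimate of the long-run average cost (assumption)
    phi_sa    -- feature vector of the (state, action) pair just taken
    cost      -- single-stage cost observed, e.g. derived from queue lengths
    next_phis -- feature vectors of every action available in the next state
    alpha     -- step size for the weights (assumed value)
    beta      -- smaller step size for rho (assumed value)
    """
    # Costs are minimized, so the greedy action has the smallest Q-value.
    best_next = min(q_value(theta, phi) for phi in next_phis)
    # Average cost TD error: no discount factor, rho is subtracted instead.
    td_error = cost - rho + best_next - q_value(theta, phi_sa)
    # Gradient-style step on the weights along the visited features.
    new_theta = [t + alpha * td_error * f for t, f in zip(theta, phi_sa)]
    # Track the average cost with a simple running average.
    new_rho = rho + beta * (cost - rho)
    return new_theta, new_rho, td_error


# Example: two features, two candidate signal configurations in the next state.
theta, rho, td = average_cost_q_update(
    theta=[0.0, 0.0], rho=0.0, phi_sa=[1.0, 0.0], cost=2.0,
    next_phis=[[1.0, 0.0], [0.0, 1.0]], alpha=0.5, beta=0.1,
)
```

Because the weight vector has one entry per feature rather than one per state-action pair, its size does not grow with the number of junctions in the way a full state representation does, which is the scalability issue the abstract points to.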
