Recursive least-squares temporal difference learning for adaptive traffic signal control at intersection

This paper presents a new method for solving the scheduling problem of adaptive traffic signal control at an intersection. The method integrates recursive least-squares temporal difference (RLS-TD(λ)) learning into approximate dynamic programming. RLS-TD(λ) adapts a linear function approximation by updating its parameters based on environmental feedback. The study investigates the implementation of the method after modeling the traffic dynamics at an intersection in discrete time. The model considers different traffic control schemes with respect to signal phase sequence, in particular the adaptive phase sequence (APS) defined here. Simulations of traffic scenarios show that RLS-TD(λ) is superior to TD(λ) for updating the parameters of the approximation, and that APS outperforms conventional control schemes in reducing traffic delay. Compared with other traffic signal control algorithms, the proposed algorithm yields satisfactory results in terms of traffic delay and computation time.
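To make the parameter update concrete, the following is a minimal sketch of a generic RLS-TD(λ) update for a linear value function, written in its common textbook form with the forgetting factor fixed at 1. It is not the paper's traffic model or its exact algorithm: the class name `RLSTDLambda`, the hyperparameters `gamma`, `lam`, and `delta`, and the two-state chain in `demo` are all assumptions chosen for illustration.

```python
class RLSTDLambda:
    """Generic RLS-TD(lambda) for a linear value function V(s) = theta . phi(s).

    Illustrative sketch only; names and defaults are hypothetical.
    """

    def __init__(self, n_features, gamma=0.95, lam=0.5, delta=100.0):
        self.gamma = gamma                      # discount factor
        self.lam = lam                          # eligibility-trace decay
        self.theta = [0.0] * n_features         # linear weights
        self.z = [0.0] * n_features             # eligibility trace
        # P starts as delta * I (a large delta means weak regularization).
        self.P = [[delta if i == j else 0.0 for j in range(n_features)]
                  for i in range(n_features)]

    def value(self, phi):
        return sum(t * p for t, p in zip(self.theta, phi))

    def update(self, phi, reward, phi_next):
        n = len(self.theta)
        g = self.gamma
        # Eligibility trace: z <- gamma * lambda * z + phi(s)
        self.z = [g * self.lam * zi + pi for zi, pi in zip(self.z, phi)]
        # Feature difference d = phi(s) - gamma * phi(s')
        d = [p - g * pn for p, pn in zip(phi, phi_next)]
        # P z (column vector) and d^T P (row vector)
        Pz = [sum(self.P[i][j] * self.z[j] for j in range(n)) for i in range(n)]
        dP = [sum(d[i] * self.P[i][j] for i in range(n)) for j in range(n)]
        # Assumes the denominator is nonzero (true in the demo below).
        denom = 1.0 + sum(di * pzi for di, pzi in zip(d, Pz))
        gain = [pz / denom for pz in Pz]
        # TD error: r + gamma * V(s') - V(s) = r - d . theta
        td_error = reward - sum(di * ti for di, ti in zip(d, self.theta))
        self.theta = [t + k * td_error for t, k in zip(self.theta, gain)]
        # Sherman-Morrison rank-one update: P <- P - gain * (d^T P)
        self.P = [[self.P[i][j] - gain[i] * dP[j] for j in range(n)]
                  for i in range(n)]
        return td_error


def demo(steps=1000):
    """Hypothetical two-state chain: 0 -> 1 (reward 1), 1 -> 0 (reward 0)."""
    features = [[1.0, 0.0], [0.0, 1.0]]         # one-hot (tabular) features
    learner = RLSTDLambda(n_features=2, gamma=0.5, lam=0.5, delta=100.0)
    state = 0
    for _ in range(steps):
        next_state = 1 - state
        reward = 1.0 if state == 0 else 0.0
        learner.update(features[state], reward, features[next_state])
        state = next_state
    return learner.theta
```

For this toy chain with discount 0.5, the Bellman equations give V(0) = 4/3 and V(1) = 2/3, so the weights returned by `demo()` should approach those values. Unlike plain TD(λ), the update needs no step-size parameter; the matrix `P` takes that role, at the price of quadratic cost in the number of features per step.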
