A junction-tree based learning algorithm to optimize network wide traffic control: A coordinated multi-agent framework

This study develops a novel reinforcement learning algorithm for the challenging coordinated signal control problem. Traffic signals are modeled as intelligent agents interacting with a stochastic traffic environment, and the model is built on the framework of coordinated reinforcement learning. A reinforcement learning algorithm based on the Junction Tree Algorithm (JTA) is proposed to obtain an exact inference of the best joint action for all coordinated intersections. The algorithm is implemented and tested in VISSIM on a network containing 18 signalized intersections. Results show that the JTA-based algorithm outperforms independent learning (Q-learning), real-time adaptive learning, and fixed timing plans in terms of average delay, number of stops, and vehicular emissions at the network level.
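
The abstract does not spell out how the best joint action is inferred, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a chain-shaped coordination graph, in which case the junction tree has one clique per pair of neighboring intersections and exact inference reduces to max-sum message passing. The names `best_joint_action`, `brute_force`, `unary`, and `pairwise`, along with all payoff values, are hypothetical.

```python
import itertools
import random


def best_joint_action(unary, pairwise):
    """Exact argmax over joint actions on a chain of agents (assumed topology).

    unary[i][a]       : hypothetical local payoff of agent i taking action a
    pairwise[i][a][b] : hypothetical payoff shared by agents i and i+1
    """
    n = len(unary)
    # msg[i][a]: best total payoff of agents i+1..n-1, given agent i plays a
    msg = [[0.0] * len(unary[i]) for i in range(n)]
    # arg[i][a]: action of agent i+1 that achieves msg[i][a]
    arg = [[0] * len(unary[i]) for i in range(n)]

    # Backward pass: eliminate agents from the end of the chain.
    for i in range(n - 2, -1, -1):
        for a in range(len(unary[i])):
            scores = [
                pairwise[i][a][b] + unary[i + 1][b] + msg[i + 1][b]
                for b in range(len(unary[i + 1]))
            ]
            best_b = max(range(len(scores)), key=scores.__getitem__)
            msg[i][a] = scores[best_b]
            arg[i][a] = best_b

    # Forward pass: pick the root action, then follow the stored argmaxes.
    root_scores = [unary[0][a] + msg[0][a] for a in range(len(unary[0]))]
    a0 = max(range(len(root_scores)), key=root_scores.__getitem__)
    actions = [a0]
    for i in range(n - 1):
        actions.append(arg[i][actions[-1]])
    return actions, root_scores[a0]


def brute_force(unary, pairwise):
    """Reference answer by enumerating every joint action (exponential cost)."""
    n = len(unary)
    best = None
    for joint in itertools.product(*(range(len(u)) for u in unary)):
        value = sum(unary[i][joint[i]] for i in range(n))
        value += sum(pairwise[i][joint[i]][joint[i + 1]] for i in range(n - 1))
        if best is None or value > best[1]:
            best = (list(joint), value)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    n_agents, n_actions = 6, 2  # illustrative sizes, not from the paper
    unary = [[rng.random() for _ in range(n_actions)] for _ in range(n_agents)]
    pairwise = [
        [[rng.random() for _ in range(n_actions)] for _ in range(n_actions)]
        for _ in range(n_agents - 1)
    ]
    print(best_joint_action(unary, pairwise))
    print(brute_force(unary, pairwise))
```

Message passing visits each neighboring pair once, so its cost grows linearly with the number of intersections on the chain, whereas the enumeration in `brute_force` grows exponentially. On graphs with cycles, a junction tree would group intersections into larger cliques, which this sketch does not cover.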
