MARL-FWC: Optimal Coordination of Freeway Traffic Control Measures

The objective of this article is to optimize overall traffic flow on freeways using multiple ramp metering controls together with their complementary Dynamic Speed Limits (DSLs). Optimal freeway operation is reached by minimizing the difference between the freeway density and the critical ratio for maximum traffic flow. To this end, a Multi-Agent Reinforcement Learning for Freeways Control (MARL-FWC) system for ramp metering and DSLs is proposed. MARL-FWC introduces a new microscopic framework at the network level, based on collaborative Markov Decision Process modeling (a Markov game) and an associated cooperative Q-learning algorithm. The technique incorporates payoff propagation (the Max-Plus algorithm) under the coordination graphs framework, which is particularly suited to optimal control. MARL-FWC provides three control designs (fully independent, fully distributed, and centralized), each suited to a different network architecture. MARL-FWC was tested extensively to assess the proposed model of the joint payoff as well as the global payoff. The experiments are conducted under heavy traffic flow in the VISSIM traffic simulator. The results show a significant decrease in total travel time and an increase in average speed compared with the base case, while maintaining an optimal traffic flow.
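To make the payoff-propagation idea concrete, the following is a minimal sketch of Max-Plus message passing on a small coordination graph. It is not the MARL-FWC implementation: the three-agent chain, the two actions per agent, the pairwise payoff values, and the names `max_plus` and `brute_force` are all assumptions chosen for illustration. Each agent could stand for one ramp meter or DSL controller, and each edge payoff for a learned local Q-value shared by two neighbouring controllers.

```python
import itertools


def edge_payoff(payoff, i, j, ai, aj):
    """Look up the pairwise payoff for edge (i, j) in either orientation."""
    if (i, j) in payoff:
        return payoff[(i, j)][ai][aj]
    return payoff[(j, i)][aj][ai]


def max_plus(n_actions, edges, payoff, n_iters=10):
    """Approximate the joint action maximizing the sum of edge payoffs.

    Messages are exact on tree-structured graphs once n_iters is at
    least the graph diameter; on graphs with cycles the result is only
    an approximation.
    """
    agents = sorted({v for e in edges for v in e})
    neighbors = {i: [] for i in agents}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # mu[(i, j)][aj] is the message agent i sends to j about j's action aj.
    mu = {(i, j): [0.0] * n_actions for i in agents for j in neighbors[i]}
    for _ in range(n_iters):
        new_mu = {}
        for (i, j) in mu:
            msg = [
                max(
                    edge_payoff(payoff, i, j, ai, aj)
                    + sum(mu[(k, i)][ai] for k in neighbors[i] if k != j)
                    for ai in range(n_actions)
                )
                for aj in range(n_actions)
            ]
            # Subtract the mean so messages stay bounded on cyclic graphs.
            mean = sum(msg) / len(msg)
            new_mu[(i, j)] = [m - mean for m in msg]
        mu = new_mu

    # Each agent picks the action that maximizes its incoming messages.
    joint = {}
    for i in agents:
        g = [sum(mu[(k, i)][ai] for k in neighbors[i]) for ai in range(n_actions)]
        joint[i] = max(range(n_actions), key=g.__getitem__)
    return joint


def brute_force(n_actions, edges, payoff):
    """Enumerate every joint action; feasible only for tiny problems."""
    agents = sorted({v for e in edges for v in e})

    def total(assign):
        return sum(
            edge_payoff(payoff, i, j, assign[i], assign[j]) for i, j in edges
        )

    candidates = (
        dict(zip(agents, combo))
        for combo in itertools.product(range(n_actions), repeat=len(agents))
    )
    return max(candidates, key=total)


# Hypothetical chain 0 - 1 - 2 with made-up pairwise payoffs.
edges = [(0, 1), (1, 2)]
payoff = {
    (0, 1): [[1, 0], [0, 4]],
    (1, 2): [[5, 0], [0, 1]],
}
best = max_plus(2, edges, payoff)
reference = brute_force(2, edges, payoff)
```

In this toy example the edge between agents 0 and 1 alone would favour action 1 for both, but the messages from agent 2 shift the joint choice, so `best` agrees with the exhaustive `reference`. That agreement is guaranteed here only because the graph is a tree and the optimum is unique.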
