Traffic flow optimization: A reinforcement learning approach

Traffic congestion causes serious problems, such as delays, increased fuel consumption, and additional pollution. In this paper we propose a new method for optimizing traffic flow based on reinforcement learning. We show that the traffic flow optimization problem can be formulated as a Markov Decision Process, and we use Q-learning to learn policies that dictate the maximum driving speed allowed on a highway so that traffic congestion is reduced. An important difference between our work and existing approaches is that we take traffic predictions into account. A series of simulation experiments shows that the resulting policies significantly reduce traffic congestion under high traffic demand, and that including traffic predictions improves the quality of the resulting policies. Additionally, the policies are sufficiently robust to deal with inaccurate speed and density measurements.

Highlights:
- We model a traffic flow optimization problem as a reinforcement learning problem.
- We show how speed limit policies can be obtained using Q-learning.
- Neural networks improve the performance of our policy learning algorithm.
- The resulting policies are able to significantly reduce traffic congestion.
- Our method takes traffic predictions into account and controls proactively.
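To make the formulation concrete, the sketch below shows tabular Q-learning applied to a toy speed-limit MDP. It is not the paper's model: the discretized density states, the candidate speed limits, and the transition/reward dynamics in `step` are all invented here purely for illustration, and the paper's actual approach additionally uses neural networks and traffic predictions, which this sketch omits.

```python
import random

# Hypothetical toy setting: states are discretized traffic density levels,
# actions are candidate speed limits. None of these values come from the paper.
STATES = range(5)              # 0 = free flow ... 4 = jammed
ACTIONS = [60, 80, 100, 120]   # speed limits in km/h

def step(state, action):
    """Invented toy dynamics: low limits ease high-density traffic,
    high limits preserve free flow, mismatches build congestion."""
    if state >= 3 and action <= 80:
        next_state = state - 1            # congestion dissolves
    elif state <= 2 and action >= 100:
        next_state = state                # free flow preserved
    else:
        next_state = min(state + 1, 4)    # congestion builds
    reward = -next_state                  # penalize density (proxy for delay)
    return next_state, reward

def q_learning(episodes=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(list(STATES))
        for _ in range(20):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda a_: Q[(s, a_)])
            s2, r = step(s, a)
            # standard Q-learning update (Watkins, 1989)
            best_next = max(Q[(s2, a_)] for a_ in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q

Q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
```

Under these toy dynamics the learned policy lowers the speed limit at high density and raises it in free flow, which mirrors the qualitative behavior the paper aims for.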
