An indirect reinforcement learning approach for ramp control under incident-induced congestion

Incident-induced congestion is one of the main causes of delay on motorways. Traffic control strategies for managing such congestion can be classified as model-based or model-free, and each class has its own merits and drawbacks. The Dyna-Q architecture combines model-free learning with model-based planning to obtain the benefits of both. Based on Dyna-Q, this study derives an indirect reinforcement learning (IRL) approach for ramp control. The new method is compared with two others: direct reinforcement learning (DRL) and ALINEA. Simulation results show that, with suitable weight values, IRL achieves superior performance in many scenarios. Moreover, IRL learns much faster than DRL.
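To make the Dyna-Q idea concrete, the following is a minimal tabular sketch of the generic architecture, not the authors' ramp-metering formulation. The five-state chain environment, the function names `dyna_q` and `chain_step`, and all parameter values are assumptions chosen only for illustration. Each real transition drives one direct Q-learning update and is also stored in a learned model; the model is then replayed for a number of simulated planning updates.

```python
import random
from collections import defaultdict


def dyna_q(step, start_state, actions, episodes=50, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q for a deterministic, episodic environment.

    `step(state, action)` must return (reward, next_state, done).
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    q = defaultdict(float)   # Q-values keyed by (state, action)
    model = {}               # learned model: (state, action) -> (reward, next_state, done)

    def greedy(state):
        # Pick a maximizing action, breaking ties at random.
        best = max(q[(state, a)] for a in actions)
        return rng.choice([a for a in actions if q[(state, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning backup.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s, done = start_state, False
        while not done:
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)
            update(s, a, r, s2, done)          # model-free learning from real experience
            model[(s, a)] = (r, s2, done)      # model learning
            for _ in range(planning_steps):    # model-based planning from simulated experience
                (ps, pa), (pr, ps2, pdone) = rng.choice(list(model.items()))
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return q


def chain_step(state, action):
    """Hypothetical toy environment: states 0..4, reward 1.0 on reaching state 4."""
    next_state = min(max(state + action, 0), 4)
    done = next_state == 4
    return (1.0 if done else 0.0), next_state, done


q_values = dyna_q(chain_step, start_state=0, actions=[-1, 1])
```

Setting `planning_steps=0` in this sketch reduces it to plain model-free Q-learning, which is one way to see why adding planning updates can speed up learning: each real interaction is reused many times through the stored model.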
