Cordon control with spatially-varying metering rates: A Reinforcement Learning approach

Abstract: This work explores how Reinforcement Learning (RL) can be used to re-time traffic signals around cordoned neighborhoods. An RL-based controller is developed by representing traffic states as graph-structured data and customizing neural network architectures to handle those data. The customizations enable the controller to: (i) model neighborhood-wide traffic using directed-graph representations; (ii) use those representations to identify patterns in real-time traffic measurements; and (iii) map those patterns to the spatial representation needed for selecting optimal cordon-metering rates. The selection process also takes as input the total inflow rate to be admitted through a cordon. That total rate is optimized in a separate process that is not part of the present work. Our RL controller distributes the separately optimized rate across the signalized street links that feed traffic through the cordon, so the resulting metering rates vary from one feeder link to the next. The selection process can recur at short time intervals in response to changing traffic patterns. Once trained on a few cordons, the RL controller can be deployed on cordons elsewhere in a city without additional training. This portability is confirmed via simulations of traffic on an idealized street network. The tests also indicate that the controller can reduce the network’s vehicle hours traveled (VHT) well beyond what can be achieved via spatially uniform cordon metering. The extra reductions in VHT grow larger when traffic exhibits greater inhomogeneities over the network.
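The distribution step described in the abstract can be illustrated with a minimal sketch. It is not the authors' controller: the per-link scores stand in for whatever a trained controller would output, and the softmax weighting, the function names `distribute_inflow` and `uniform_inflow`, and the link labels are all assumptions made for illustration. The sketch only shows how a fixed total inflow can be split into spatially varying metering rates that still sum to that total, alongside the spatially uniform baseline.

```python
import math


def distribute_inflow(total_inflow, link_scores):
    """Split a cordon's total admitted inflow across its feeder links.

    total_inflow: total rate admitted through the cordon (e.g. veh/h),
        assumed to be supplied by a separate optimization.
    link_scores: dict mapping a feeder-link label to a hypothetical
        preference score; a higher score yields a larger share.
    The softmax weighting is an illustrative assumption, not the
    method of the paper.
    """
    if total_inflow < 0:
        raise ValueError("total_inflow must be non-negative")
    if not link_scores:
        raise ValueError("at least one feeder link is required")
    # Subtract the maximum score for numerical stability.
    top = max(link_scores.values())
    weights = {link: math.exp(s - top) for link, s in link_scores.items()}
    norm = sum(weights.values())
    return {link: total_inflow * w / norm for link, w in weights.items()}


def uniform_inflow(total_inflow, links):
    """Baseline: the same metering rate on every feeder link."""
    links = list(links)
    if not links:
        raise ValueError("at least one feeder link is required")
    share = total_inflow / len(links)
    return {link: share for link in links}


if __name__ == "__main__":
    # Hypothetical scores: the "south" link is favored two-to-one.
    scores = {"north": 0.0, "east": 0.0, "south": math.log(2)}
    print(distribute_inflow(1200.0, scores))
    print(uniform_inflow(1200.0, scores))
```

With these hypothetical scores, the favored link receives half of the total and the other two links a quarter each, whereas the uniform baseline gives every link one third. In both cases the per-link rates sum to the total inflow supplied from outside.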
