A modified reinforcement learning algorithm for solving coordinated signalized networks

This study proposes a Reinforcement Learning (RL) based algorithm for finding optimum signal timings in Coordinated Signalized Networks (CSN) for a fixed set of link flows. For this purpose, the MOdified REinforcement Learning algorithm with TRANSYT-7F (MORELTRANS) model is proposed, combining an RL algorithm with TRANSYT-7F. The modified RL differs from other RL algorithms in that it takes advantage of the best solution obtained from the previous learning episode by generating, at each learning episode, a sub-environment of the same size as the original environment. The TRANSYT-7F traffic model is used to determine the network performance index, namely the disutility index. A numerical application is conducted on a medium-sized coordinated signalized road network. The results indicate that MORELTRANS produced slightly better results than the Genetic Algorithm (GA) in signal timing optimization in terms of objective function value, while it outperformed Hill Climbing (HC). To show the capability of the proposed model under heavy demand conditions, two cases are considered in which link flows are increased by 20% and 50% with respect to the base case. MORELTRANS is found to reach good solutions for signal timing optimization even when demand is increased.
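The sub-environment idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' MORELTRANS implementation: the abstract gives no update rules, so the code below is a plain sampling search with no reward or value update. The only feature it takes from the abstract is that each episode draws a new set of candidate solutions, the same size as the original environment, around the best solution found so far. The function names, the shrinking `radius`, and the quadratic `toy_disutility` (a stand-in for the TRANSYT-7F disutility index) are all assumptions made for illustration.

```python
import random


def toy_disutility(timings):
    """Hypothetical stand-in for the TRANSYT-7F disutility index (lower is better)."""
    target = [0.3, 0.6, 0.5, 0.8]  # assumed optimum, for illustration only
    return sum((t - g) ** 2 for t, g in zip(timings, target))


def sample_environment(n_points, bounds, rng):
    """Draw the original environment: n_points random timing vectors within bounds."""
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_points)]


def sample_sub_environment(best, n_points, bounds, radius, rng):
    """Draw a sub-environment of the same size, centred on the best solution so far."""
    points = []
    for _ in range(n_points):
        point = []
        for b, (lo, hi) in zip(best, bounds):
            step = rng.uniform(-radius, radius) * (hi - lo)
            point.append(min(hi, max(lo, b + step)))  # clip to the feasible range
        points.append(point)
    return points


def search(objective, bounds, n_points=30, episodes=40, shrink=0.9, seed=0):
    """Episode loop: keep the best solution and resample around it each episode."""
    rng = random.Random(seed)
    environment = sample_environment(n_points, bounds, rng)
    best = min(environment, key=objective)
    best_value = objective(best)
    history = [best_value]
    radius = 0.5  # assumed initial neighbourhood size, as a fraction of each range
    for _ in range(episodes):
        sub_env = sample_sub_environment(best, n_points, bounds, radius, rng)
        candidate = min(sub_env, key=objective)
        value = objective(candidate)
        if value < best_value:  # reuse the previous best unless it is beaten
            best, best_value = candidate, value
        radius *= shrink  # assumed schedule: narrow the sub-environment over time
        history.append(best_value)
    return best, best_value, history


if __name__ == "__main__":
    bounds = [(0.0, 1.0)] * 4  # four normalised timing variables (assumed)
    best, value, history = search(toy_disutility, bounds)
    print(best, value)
```

In a real setting the objective would be a call into a traffic model rather than a closed-form function, and the timing variables would carry cycle-time, offset, and green-split constraints that this sketch ignores.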
