Arterial traffic control using reinforcement learning agents and information from adjacent intersections in the state and reward structure

An application that uses reinforcement learning (RL) agents for traffic control along an arterial under high traffic volumes is presented. The agents were trained using Q-learning with a modified state representation that included the occupancy of links from neighboring intersections. The proposed structure also includes a reward that accounts for potential blockage from downstream intersections (due to saturated conditions), as well as pressure to coordinate the signal response with the future arrival of traffic from upstream intersections. Experiments using microscopic simulation software were conducted for an arterial with 5 intersections under high conflicting volumes, and results were compared with the best settings of coordinated pre-timed phasing. The RL agents produced lower delays and fewer stops, as well as a more balanced distribution of delay among all vehicles in the system. Evidence of coordinated-like behavior was found: the average number of stops needed to traverse the 5 intersections was below 1.5, and the distribution of green times was very similar across all intersections. As traffic approached capacity, however, delays with pre-timed phasing were lower than with the RL agents, although the agents produced lower maximum delays and a lower maximum number of stops per vehicle. Future research will analyze variable coefficients in the state and reward structures so that the system can better cope with a wide variety of traffic volumes, including transitions from oversaturation to undersaturation and vice versa.
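The state and reward structure described above can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the action set, the occupancy discretization, the reward weights `w_block` and `w_coord`, and every function name below are hypothetical choices made only to show how neighbor occupancy might enter the state and how a downstream blockage penalty and an upstream coordination term might enter the reward.

```python
import random
from collections import defaultdict

# Hypothetical action set: hold the current phase or switch to the next one.
ACTIONS = ("keep_phase", "switch_phase")


def discretize(occupancy, bins=3):
    """Map an occupancy in [0, 1] to an integer bin (0 = nearly empty)."""
    return min(int(occupancy * bins), bins - 1)


def make_state(own_occ, upstream_occ, downstream_occ, phase):
    """Build a hashable state from local and neighboring link occupancies.

    own_occ: occupancies of this intersection's approaches (assumed in [0, 1]).
    upstream_occ / downstream_occ: occupancies of links at adjacent intersections.
    """
    return (
        tuple(discretize(o) for o in own_occ),
        discretize(upstream_occ),
        discretize(downstream_occ),
        phase,
    )


def reward(served, downstream_occ, upstream_occ, green_on_arterial,
           w_block=2.0, w_coord=1.0):
    """Illustrative reward: vehicles served, adjusted by neighbor conditions.

    The weights are assumptions, not values from the paper.
    """
    r = float(served)
    if green_on_arterial:
        # Penalize releasing traffic into a nearly full downstream link.
        r -= w_block * served * downstream_occ
        # Encourage green on the arterial when upstream traffic is about to arrive.
        r += w_coord * upstream_occ
    return r


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy action selection over the tabular Q-values."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])


# One agent per intersection would keep its own table.
q_table = defaultdict(float)
```

In a simulation loop, each agent would call `make_state` at every decision point, pick an action with `choose_action`, observe the resulting traffic, and then call `q_update` with the value returned by `reward`. Making `w_block` and `w_coord` depend on the traffic level would be one way to explore the variable coefficients mentioned as future research.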
