Deep Q learning-based traffic signal control algorithms: Model development and evaluation with field data

Abstract: To combat traffic congestion on urban networks, researchers have made great efforts over the last decade to develop traffic-responsive signal timing algorithms. More recently, machine learning-based methods have been tested on traffic signal timing problems as an alternative to conventional model-based algorithms, and they have shown promising potential. However, many researchers and practitioners still question the feasibility and applicability of machine learning techniques in the adaptive traffic signal control (ATSC) domain. One reason is that these methods assume flawless detectors and rely heavily on simulators for training and evaluation. To address this concern, this article customizes a Deep Q-Learning (DQL) method to optimize traffic signal timings at urban intersections, where partial observations from identity-based detectors are the inputs and the green splits are the outputs. A simulation-free, data-driven prediction model is also developed to train the DQL agent with reduced computational time. The machine learning-based methods are then evaluated on a real-world case with Automatic Number-Plate Recognition (ANPR) data. Experiments show that the proposed data-driven model can predict the traffic state in limited computational time, and that the DQL algorithm performs 3.9% better than the field performance of the adaptive control system SCOOT and 22% better than the time-of-day plan produced by SYNCHRO. The results indicate that, with restrictive input and output settings in congested traffic flow, the DQL methods yield only a marginal improvement over the conventional adaptive method.
