Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control

Adaptive traffic signal control in multi-intersection systems has attracted considerable research attention. Among existing methods, reinforcement learning has been shown to be effective. However, complex intersection features, heterogeneous intersection structures, and dynamic coordination among multiple intersections pose challenges for reinforcement learning-based algorithms. This paper proposes a cooperative deep Q-network with Q-value transfer (QT-CDQN) for adaptive multi-intersection signal control. In QT-CDQN, a regional multi-intersection traffic network is modeled as a multi-agent reinforcement learning system. Each agent searches for the optimal strategy to control one intersection using a deep Q-network that takes a discrete state encoding of traffic information as its input. To work cooperatively, each agent considers the influence of the latest actions of its neighbors during policy learning. Specifically, the optimal Q-values of the neighbor agents at the latest time step are transferred to the loss function of the Q-network. Moreover, a target network and experience replay are used to improve the stability of the algorithm. The advantages of QT-CDQN lie not only in its effectiveness and scalability for multi-intersection systems but also in its versatility in dealing with heterogeneous intersection structures. Experimental studies under different road structures show that QT-CDQN is competitive with state-of-the-art algorithms in terms of average queue length, average speed, and average waiting time. Furthermore, experiments on recurring and occasional congestion validate the adaptability of QT-CDQN to dynamic traffic environments.
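The abstract does not state the exact form of the loss, so the following is only a minimal sketch of how neighbor Q-values could enter a temporal-difference target. The function names, the additive form, the summation over neighbors, and the `transfer_weight` parameter are all assumptions made for illustration, not the formulation used in the paper.

```python
from typing import Sequence


def qt_target(reward: float,
              next_q_target: Sequence[float],
              neighbor_best_q: Sequence[float],
              gamma: float = 0.9,
              transfer_weight: float = 0.1) -> float:
    """Hypothetical TD target with a neighbor Q-value transfer term.

    next_q_target: Q-values of the agent's own target network for the
        next state, one entry per action.
    neighbor_best_q: each neighbor's optimal (maximum) Q-value at the
        latest time step.
    transfer_weight: assumed scalar weighting of the transferred values.
    """
    # Standard DQN bootstrap: best next-state value from the target network.
    own_bootstrap = max(next_q_target)
    # Assumed transfer term: weighted sum of the neighbors' optimal Q-values.
    transferred = sum(neighbor_best_q)
    return reward + gamma * own_bootstrap + transfer_weight * transferred


def squared_td_loss(q_taken: float, target: float) -> float:
    """Squared error between the target and the Q-value of the action taken."""
    return (target - q_taken) ** 2
```

In this sketch, setting `transfer_weight` to zero recovers the ordinary single-agent DQN target, which makes the contribution of the neighbor term easy to isolate.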
