Reinforcement Learning Based Monte Carlo Tree Search for Temporal Path Discovery

An Attributed Dynamic Graph (ADG) contains multiple dynamic attributes associated with each edge. In ADG-based applications, users can usually specify multiple constraints on these attributes to express their requirements, such as the total cost, the total travel time, and the stopover interval of a flight between two cities. This motivates Multi-Constrained Temporal Path (MCTP) discovery in ADGs, which is a challenging NP-complete problem. To deliver an efficient and effective temporal path discovery method for real-time environments, we propose a Reinforcement Learning (RL) based Monte Carlo Tree Search algorithm (RL-MCTS). RL-MCTS uses a newly designed memory structure to address the challenges that Monte Carlo Tree Search (MCTS) faces in MCTP discovery. To the best of our knowledge, RL-MCTS is the first RL algorithm that supports path discovery in ADGs. Experimental results on ten real dynamic graphs demonstrate that our algorithm outperforms state-of-the-art methods in both efficiency and effectiveness.
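To make the notion of a multi-constrained temporal path concrete, the following minimal sketch checks whether a candidate path satisfies the three constraints from the flight example (total cost, total travel time, stopover interval). It is not the RL-MCTS algorithm itself. The names `TemporalEdge`, `Constraints`, and `is_feasible`, the choice of minutes as the time unit, and the sample flights are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TemporalEdge:
    """One edge of an ADG at a given time (hypothetical representation)."""
    src: str
    dst: str
    depart: int   # departure time in minutes (assumed unit)
    arrive: int   # arrival time in minutes (assumed unit)
    cost: float   # one dynamic attribute, e.g. ticket price


@dataclass(frozen=True)
class Constraints:
    """User-specified bounds on the path attributes (hypothetical)."""
    max_total_cost: float
    max_total_time: int
    max_stopover: int


def is_feasible(path: List[TemporalEdge], c: Constraints) -> bool:
    """Return True if the path is temporally valid and meets every constraint."""
    if not path:
        return False
    for prev, nxt in zip(path, path[1:]):
        # Consecutive edges must connect at the same vertex.
        if prev.dst != nxt.src:
            return False
        # The next edge cannot depart before the previous one arrives,
        # and the stopover must not exceed the allowed interval.
        wait = nxt.depart - prev.arrive
        if wait < 0 or wait > c.max_stopover:
            return False
    total_cost = sum(e.cost for e in path)
    total_time = path[-1].arrive - path[0].depart
    return total_cost <= c.max_total_cost and total_time <= c.max_total_time


# Illustrative two-leg itinerary with a 90-minute stopover.
itinerary = [
    TemporalEdge("SYD", "SIN", depart=0, arrive=480, cost=400.0),
    TemporalEdge("SIN", "LHR", depart=570, arrive=1350, cost=600.0),
]
loose = Constraints(max_total_cost=1200.0, max_total_time=1500, max_stopover=120)
tight = Constraints(max_total_cost=1200.0, max_total_time=1500, max_stopover=60)
```

Under these assumptions, `is_feasible(itinerary, loose)` holds, while `is_feasible(itinerary, tight)` fails because the 90-minute stopover exceeds the 60-minute bound. Verifying a single path is cheap; the difficulty of MCTP discovery lies in searching the space of candidate paths, which is where a search method such as MCTS comes in.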
