Efficient Sampling Algorithms for Approximate Temporal Motif Counting

Many complex systems, ranging from user interactions in communication networks to transactions in financial markets, can be modeled as temporal graphs, which consist of a set of vertices and a sequence of timestamped, directed edges. Temporal motifs generalize the subgraph patterns of static graphs: in addition to structure, they account for the ordering and duration of edges. Counting the occurrences of temporal motifs is a fundamental problem in temporal network analysis. However, existing methods either cannot support temporal motifs or suffer from performance issues. In this paper, we focus on approximate temporal motif counting via random sampling. We first propose a generic edge sampling (ES) algorithm for estimating the number of instances of any temporal motif. We then devise an improved algorithm, EWS, that combines edge sampling with wedge sampling for counting temporal motifs with 3 vertices and 3 edges. We provide comprehensive analyses of the theoretical bounds and complexities of the proposed algorithms. Finally, we conduct extensive experiments on several real-world datasets, and the results show that ES and EWS achieve higher efficiency, better accuracy, and greater scalability than the state-of-the-art sampling method for temporal motif counting.
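To make the idea of counting by edge sampling concrete, the following is a minimal Python sketch, not the paper's ES algorithm itself. It assumes a simplified estimator: each edge is kept independently with probability `p`, the instances whose first edge is a kept edge are counted exactly by brute force, and the total is scaled by `1/p`. The function names (`bind`, `count_from_first_edge`, `es_estimate`), the `(u, v, t)` edge format, and the rule that ties in timestamps are ordered by list position are all assumptions made for illustration.

```python
import random


def bind(mapping, pattern_edge, graph_edge):
    """Try to extend an injective pattern-vertex -> graph-vertex mapping."""
    new = dict(mapping)
    for p_vertex, g_vertex in zip(pattern_edge, graph_edge[:2]):
        if p_vertex in new:
            if new[p_vertex] != g_vertex:
                return None  # pattern vertex already bound to another vertex
        elif g_vertex in new.values():
            return None  # graph vertex already used by another pattern vertex
        else:
            new[p_vertex] = g_vertex
    return new


def count_from_first_edge(edges, start, motif, delta):
    """Count motif instances whose first edge is edges[start].

    edges: list of (u, v, t) sorted by t (assumed input format).
    motif: list of (a, b) pattern edges in their required temporal order.
    delta: maximum time span between the first and last edge of an instance.
    """
    t0 = edges[start][2]

    def extend(last_idx, k, mapping):
        if k == len(motif):
            return 1
        total = 0
        for j in range(last_idx + 1, len(edges)):
            if edges[j][2] - t0 > delta:
                break  # edges are sorted, so later ones are out of range too
            new = bind(mapping, motif[k], edges[j])
            if new is not None:
                total += extend(j, k + 1, new)
        return total

    first = bind({}, motif[0], edges[start])
    return 0 if first is None else extend(start, 1, first)


def es_estimate(edges, motif, delta, p, rng):
    """Keep each edge with probability p and scale the local counts by 1/p."""
    total = 0
    for i in range(len(edges)):
        if rng.random() < p:
            total += count_from_first_edge(edges, i, motif, delta)
    return total / p


# Toy temporal graph and a cyclic triangle motif 0->1, 1->2, 2->0.
edges = [(1, 2, 1), (2, 3, 2), (3, 1, 3), (1, 2, 4), (2, 3, 5), (3, 1, 20)]
motif = [(0, 1), (1, 2), (2, 0)]

# With p = 1.0 every edge is kept, so the estimate equals the exact count.
exact = es_estimate(edges, motif, delta=10, p=1.0, rng=random.Random(0))
approx = es_estimate(edges, motif, delta=10, p=0.5, rng=random.Random(0))
```

Because every instance has exactly one first edge and that edge is kept with probability `p`, the scaled sum is an unbiased estimate of the true count under these assumptions; a smaller `p` reduces the work at the cost of higher variance.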
