Mining Persistent Activity in Continually Evolving Networks

Frequent pattern mining is a key area of study that gives insights into the structure and dynamics of evolving networks, such as social or road networks. However, not only does a network evolve, but often the way that it evolves itself evolves. Thus, knowing not only how frequently patterns occur but also for how long and how regularly they have occurred (i.e., their persistence) can add to our understanding of evolving networks. In this work, we propose the problem of mining activity that persists through time in continually evolving networks, i.e., activity that repeatedly and consistently occurs. We extend the notion of temporal motifs to capture activity among specific nodes in what we call activity snippets: small sequences of edge updates that reoccur. We propose axioms and properties that a measure of persistence should satisfy, and develop a persistence measure that satisfies them. We also propose PENminer, an efficient framework for mining activity snippets' Persistence in Evolving Networks, and design both offline and streaming algorithms. We apply PENminer to numerous real, large-scale evolving networks and edge streams, and find both activity that is surprisingly regular over a long period of time, yet too infrequent to be discovered by aggregate count alone, and bursts of activity exposed by their lack of persistence. Our findings with PENminer include neighborhoods in NYC where taxi traffic persisted through Hurricane Sandy, the opening of new bike stations, characteristics of social network users, and more. Moreover, we use PENminer to identify anomalies in multiple networks, outperforming baselines at identifying subtle anomalies by 9.8 to 48% in AUC.
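
To make the contrast between frequency and persistence concrete, the following is a minimal illustrative sketch, not the persistence measure developed in this work. It assumes a deliberately simple proxy: the fraction of equal-width time bins in which a snippet occurs at least once. The function name `bin_coverage`, its parameters, and the synthetic timestamps are all hypothetical.

```python
from typing import Sequence


def bin_coverage(timestamps: Sequence[float], t_start: float,
                 t_end: float, num_bins: int) -> float:
    """Fraction of equal-width bins in [t_start, t_end) with >= 1 occurrence.

    A toy stand-in for persistence (an assumption for illustration only).
    """
    if num_bins <= 0 or t_end <= t_start:
        raise ValueError("need num_bins > 0 and t_end > t_start")
    width = (t_end - t_start) / num_bins
    occupied = set()
    for t in timestamps:
        if t_start <= t < t_end:
            # Clamp guards against floating-point rounding at the upper edge.
            occupied.add(min(int((t - t_start) / width), num_bins - 1))
    return len(occupied) / num_bins


# Synthetic data: a snippet seen once per 10 time units vs. a dense burst.
regular = [5.0 + 10.0 * i for i in range(10)]   # 10 occurrences, spread out
burst = [40.0 + 0.05 * i for i in range(100)]   # 100 occurrences, one bin

regular_score = bin_coverage(regular, 0.0, 100.0, 10)
burst_score = bin_coverage(burst, 0.0, 100.0, 10)
print(len(regular), regular_score)
print(len(burst), burst_score)
```

Under this proxy the burst has ten times the aggregate count of the regular snippet but covers only one of the ten bins, which mirrors the observation that count alone can hide regular activity and overstate bursts.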
