Streaming Local Community Detection Through Approximate Conductance

Community structure is ubiquitous in complex networks, and community detection is a fundamental task in network analysis. As networks grow in scale, they become massive and change rapidly, and can naturally be modeled as graph streams. Because graph streams impose limited memory and restricted access, existing non-streaming community detection methods are no longer applicable, which creates a need for online approaches. In this work, we consider the problem of uncovering the local community containing a few query nodes in a graph stream, termed streaming local community detection. This recently raised problem makes community detection more challenging, and only a few works address this online setting. We therefore design an online, single-pass streaming local community detection approach. Inspired by the “local” property of communities, our method samples the local structure around the query nodes in the graph stream and extracts the target community from the sampled subgraph using a proposed metric called approximate conductance. Comprehensive experiments show that our method remarkably outperforms the streaming baseline in both effectiveness and efficiency, and even achieves accuracy similar to that of state-of-the-art non-streaming local community detection methods that use static and complete graphs.
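For orientation, the sketch below computes the standard conductance of a candidate community in a single pass over an edge list. It is not the approximate conductance proposed in this work, whose definition the abstract does not give; the function name `conductance`, the edge-list input, and the convention of returning 1.0 when one side has zero volume are illustrative assumptions.

```python
def conductance(edges, community):
    """Standard conductance of a node set S in an undirected graph.

    Defined as cut(S) / min(vol(S), vol(V minus S)), where cut(S) counts
    edges with exactly one endpoint in S and vol(.) sums node degrees.
    Assumption: `edges` is an iterable of (u, v) pairs, one per edge.
    """
    community = set(community)
    cut = 0        # edges crossing the community boundary
    vol_in = 0     # sum of degrees of nodes inside the community
    vol_out = 0    # sum of degrees of nodes outside the community
    for u, v in edges:  # one pass, so edges may arrive as a stream
        u_in = u in community
        v_in = v in community
        vol_in += int(u_in) + int(v_in)
        vol_out += int(not u_in) + int(not v_in)
        if u_in != v_in:
            cut += 1
    denom = min(vol_in, vol_out)
    # Assumed convention: a side with zero volume gets the worst score.
    return cut / denom if denom > 0 else 1.0


# Two triangles joined by the single edge (2, 3).
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
score = conductance(example_edges, {0, 1, 2})
```

Lower values indicate a better separated community: here one triangle has a single boundary edge against a volume of 7 on each side, giving a conductance of 1/7.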
