Finding the Best k in Core Decomposition: A Time and Space Optimal Solution

The k-core model and its hierarchical decomposition have been applied in many areas, such as sociology, the World Wide Web, and biology. Algorithms in related studies often require a value of the parameter k as input, yet no existing solution chooses it other than by manual selection. In this paper, given a graph and a scoring metric, we aim to efficiently find the value of k for which the score of the k-core (or k-core set) is highest. The problem is challenging because there are many community scoring metrics and the computation is costly on large datasets. Using carefully designed vertex ordering techniques, we propose time and space optimal algorithms to compute the best k, which are applicable to most community metrics. The proposed algorithms can compute the score of every k-core (set) and can benefit solutions to other k-core related problems. Extensive experiments on 10 real-world networks, with sizes up to billion scale, validate both the efficiency of our algorithms and the effectiveness of the resulting k-cores.
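To make the problem statement concrete, the following is a minimal sketch of scoring every k-core and picking the best k. It is not the paper's algorithm: it assumes a simple undirected graph given as symmetric adjacency lists, uses average degree as one illustrative scoring metric, and breaks ties toward the larger k. The function names `core_numbers`, `average_degree_per_core`, and `best_k` are hypothetical. The sketch relies on the fact that an edge belongs to the k-core exactly when both endpoints have core number at least k, so suffix sums over core numbers give the size of every k-core in one pass.

```python
from collections import defaultdict


def core_numbers(adj):
    """Bucket-based peeling: repeatedly remove a vertex of minimum remaining degree."""
    deg = {v: len(ns) for v, ns in adj.items()}
    buckets = defaultdict(set)
    for v, d in deg.items():
        buckets[d].add(v)
    core = {}
    k = 0
    remaining = len(adj)
    while remaining:
        if not buckets[k]:
            k += 1
            continue
        v = buckets[k].pop()
        core[v] = k
        remaining -= 1
        for u in adj[v]:
            # Only neighbours still above the current level lose a degree.
            if u not in core and deg[u] > k:
                buckets[deg[u]].discard(u)
                deg[u] -= 1
                buckets[deg[u]].add(u)
    return core


def average_degree_per_core(adj, core):
    """Return {k: average degree of the k-core} for k = 0..kmax (illustrative metric)."""
    kmax = max(core.values(), default=0)
    n_at = [0] * (kmax + 1)     # vertices whose core number is exactly k
    arcs_at = [0] * (kmax + 1)  # directed arcs whose smaller endpoint core number is k
    for v, c in core.items():
        n_at[c] += 1
    for v, ns in adj.items():
        for u in ns:
            arcs_at[min(core[v], core[u])] += 1
    scores = {}
    n = arcs = 0
    for k in range(kmax, -1, -1):  # suffix sums: the k-core holds all cores >= k
        n += n_at[k]
        arcs += arcs_at[k]
        if n:
            scores[k] = arcs / n   # arcs = 2 * edges, so this is 2m / n
    return scores


def best_k(scores):
    """Pick the k with the highest score, preferring larger k on ties (an assumption)."""
    return max(scores, key=lambda k: (scores[k], k))


# Hypothetical example: a 4-clique {0,1,2,3} with a path 3-4-5 attached.
adj = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4], 4: [3, 5], 5: [4]}
core = core_numbers(adj)
scores = average_degree_per_core(adj, core)
print(best_k(scores), scores)
```

On this example the clique forms the 3-core, which has a higher average degree than the whole graph, so the sketch selects k = 3. Other metrics would need their own per-level counters in place of the vertex and arc counts used here.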
