Efficient Distributed Approaches to Core Maintenance on Large Dynamic Graphs

Core decomposition, a fundamental problem in graph analysis, computes the core number of every vertex in a given graph and is a powerful tool for mining important graph structures. On dynamic graphs, where vertices and edges are updated in real time, core maintenance is used to keep the core numbers of vertices up to date. Previous approaches to core maintenance face challenges in both storage and efficiency. In this article, we investigate distributed approaches to core maintenance on a Pregel-like system, a well-known type of graph computing system. We first design a core decomposition algorithm to obtain the core numbers of the vertices in a given graph. Building on it, we devise a distributed batch-stream combined algorithm (DBCA) that efficiently maintains the core numbers when vertex or edge updates occur. In particular, we introduce into DBCA a new task assignment strategy based on the diversity of the edge-cores of the updated edges. To ensure that DBCA maintains core numbers correctly, we develop a message interaction protocol that resolves crosstalk among different tasks. Comprehensive experiments have been conducted on real and synthetic graphs in two typical distributed environments, one built on a supercomputing center and one on Alibaba Cloud. The experimental results demonstrate that the proposed algorithms are efficient and scalable.
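To make the notion of a core number concrete, the following is a minimal single-machine sketch of the classic peeling procedure: repeatedly remove a vertex of minimum remaining degree and record the largest such degree seen so far. It is not the distributed decomposition algorithm or DBCA proposed in this article; the function name `core_numbers` and the edge-list input format are assumptions made only for illustration, and the simple minimum search is quadratic rather than the linear-time bucket variant.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph.

    Illustrative peeling sketch (hypothetical helper, not the paper's method).
    """
    # Build adjacency sets, ignoring self-loops and duplicate edges.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex with the smallest remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # The core number never decreases along the peeling order.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        # Removing v lowers the remaining degree of its unpeeled neighbours.
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


# A triangle {1, 2, 3} with a pendant vertex 4 attached to vertex 3.
before = core_numbers([(1, 2), (2, 3), (1, 3), (3, 4)])

# Inserting edge (4, 1) changes the core number of vertex 4 only.
after = core_numbers([(1, 2), (2, 3), (1, 3), (3, 4), (4, 1)])
```

In this example a single edge insertion changes the core number of one vertex. Recomputing everything from scratch, as the sketch does, is the cost that core maintenance aims to avoid by updating only the affected vertices.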
