SDP: Scalable Real-Time Dynamic Graph Partitioner

Large time-evolving graphs have received attention because of their role in real-world applications such as social networks and PageRank calculation. A large-scale dynamic graph must be partitioned in a streaming manner to overcome the memory bottleneck while distributing the computational load. Reducing network communication and balancing the load across partitions are the two criteria for effective run-time performance in graph partitioning. Moreover, an optimal resource allocation is needed to utilise resources effectively while assigning the graph streams to partitions. Several partitioning algorithms (ADP, LogGP and LEOPARD) have been proposed to address this problem. However, these methods cannot scale resources or handle a stream of data in real time. In this study, we propose a dynamic graph partitioning method called Scalable Dynamic Graph Partitioner (SDP), based on the streaming partitioning technique. SDP contributes a novel vertex assignment method, a communication-aware balancing method, and a scaling technique to produce an efficient dynamic graph partitioner. Experimental results show that, compared with previous algorithms, the proposed method reduces communication cost by up to 90% and improves dynamic load balance by 60%-70%. Moreover, the proposed algorithm significantly reduces execution time during partitioning.
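To make the two criteria above concrete (low communication, balanced load), the following is a minimal sketch of a generic greedy streaming vertex partitioner. It is not the SDP algorithm, whose details are not given in this abstract; it is a simple capacity-weighted heuristic used only for illustration. The names `greedy_stream_partition`, `edge_cut`, and the `capacity` parameter are assumptions introduced here.

```python
def greedy_stream_partition(vertex_stream, num_partitions, capacity):
    """Assign each arriving vertex to one partition, in a single pass.

    vertex_stream yields (vertex, neighbours) pairs. A vertex goes to the
    partition that already holds most of its neighbours, discounted by how
    full that partition is. This is an illustrative heuristic, not SDP.
    """
    assignment = {}                # vertex -> partition index
    sizes = [0] * num_partitions   # number of vertices per partition
    for vertex, neighbours in vertex_stream:
        best, best_key = None, None
        for p in range(num_partitions):
            if sizes[p] >= capacity:
                continue  # hard balance constraint: skip full partitions
            placed = sum(1 for n in neighbours if assignment.get(n) == p)
            score = placed * (1 - sizes[p] / capacity)
            key = (score, -sizes[p])  # break ties toward the emptier partition
            if best_key is None or key > best_key:
                best, best_key = p, key
        if best is None:
            raise ValueError("all partitions are full")
        assignment[vertex] = best
        sizes[best] += 1
    return assignment, sizes


def edge_cut(edges, assignment):
    """Count edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Two triangles joined by the single edge (c, d), streamed vertex by vertex.
stream = [
    ("a", []), ("b", ["a"]), ("c", ["a", "b"]),
    ("d", ["c"]), ("e", ["d"]), ("f", ["d", "e"]),
]
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
assignment, sizes = greedy_stream_partition(stream, num_partitions=2, capacity=3)
print(sizes, edge_cut(edges, assignment))
```

The edge cut stands in for communication cost and the per-partition sizes for load. A dynamic partitioner such as the one proposed would additionally have to rebalance and scale the number of partitions as the stream evolves, which this sketch does not attempt.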
