Dynamic Elasticity for Distributed Graph Analytics

Graph data analytics has received a great deal of attention in computing theory and systems research over the past decade. This paper proposes, implements, and evaluates a distributed graph analytics system that elastically scales resources up and down in response to dynamic workload changes in analytics jobs. We show that performance can be improved by relying on hash partitioning, a simple and scalable method, and dynamically changing the placement of the partitions of an already-partitioned graph. The system thereby relieves the user of resource acquisition and management decisions. We compare the system’s performance against a static partition placement to set expectations for real-world applications. Early evaluations show that our system achieves up to a 2.4x speedup for certain applications.
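
The idea of keeping a hash partitioning fixed while changing only where partitions are placed can be illustrated with a minimal sketch. It is not the system's implementation: the function names `partition_of` and `place_partitions`, the modulo hash on integer vertex IDs, and the round-robin placement policy are all assumptions made for illustration.

```python
from collections import defaultdict


def partition_of(vertex_id: int, num_partitions: int) -> int:
    """Hash partitioning (assumed: modulo on integer vertex IDs).

    The vertex-to-partition map is computed once and never changes.
    """
    return vertex_id % num_partitions


def place_partitions(num_partitions: int, num_workers: int) -> dict[int, list[int]]:
    """Assign partitions to workers round-robin (hypothetical policy).

    Scaling up or down only recomputes this map; the graph itself
    is not repartitioned.
    """
    placement = defaultdict(list)
    for p in range(num_partitions):
        placement[p % num_workers].append(p)
    return dict(placement)


# The graph is split into 8 partitions up front.
NUM_PARTITIONS = 8

# Placement on 2 workers, then on 4 workers after scaling up.
before = place_partitions(NUM_PARTITIONS, 2)
after = place_partitions(NUM_PARTITIONS, 4)

# Vertex 13 stays in the same partition regardless of worker count.
vertex_partition = partition_of(13, NUM_PARTITIONS)
```

In this sketch, scaling from 2 to 4 workers moves whole partitions between workers, while every vertex keeps its original partition. Which partitions to move, and when, is the decision an elastic system would have to make; the round-robin rule here is only a placeholder for that policy.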
