On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage

We study online graph queries that retrieve the nodes near a query node in a large network. To answer such queries with high throughput and low latency, we partition the graph and process the data in parallel across a cluster of servers. State-of-the-art distributed graph querying systems place each graph partition on a separate server, and that server answers the queries over its partition. This design has two major disadvantages. First, the router must maintain a fixed routing table, which makes these systems less flexible with respect to query routing, fault tolerance, and graph updates. Second, the graph must be partitioned so that the workload is balanced across servers and inter-machine communication is minimized, and the existing partitions must be updated as the workload over graph nodes changes. However, graph partitioning, online workload monitoring, and dynamic repartitioning are all expensive. In this work, we mitigate both problems by decoupling graph storage from query processors and by developing smart routing strategies that improve cache locality in the query processors. Because a query processor is no longer assigned a fixed part of the graph, it is equally capable of handling any request, which facilitates load balancing and fault tolerance. At the same time, our smart routing strategies let query processors leverage their cache contents effectively, reducing the overall impact of how the graph is partitioned across storage servers. A detailed experimental evaluation on several large real-world graph datasets demonstrates that our proposed framework, gRouting, even with simple hash partitioning of the data, achieves up to an order of magnitude better query throughput than existing graph querying systems that employ expensive graph partitioning and repartitioning strategies.
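
The effect of cache-aware routing over decoupled storage can be illustrated with a toy simulation. The sketch below is not the gRouting implementation: every class and function name is hypothetical, the graph is a small ring, and the routing rule (send a query to the processor that owns the landmark nearest to the query node) is a simplified stand-in chosen only for illustration. It assumes a hash-partitioned storage tier, query processors that hold nothing but an LRU cache, and a round-robin router as the baseline.

```python
from collections import OrderedDict, deque
from itertools import cycle


class HashPartitionedStorage:
    """Storage tier: adjacency lists spread over servers by hashing the node id."""

    def __init__(self, adjacency, num_servers):
        self.num_servers = num_servers
        self.servers = [dict() for _ in range(num_servers)]
        for node, nbrs in adjacency.items():
            self.servers[node % num_servers][node] = list(nbrs)

    def get(self, node):
        return self.servers[node % self.num_servers][node]


class QueryProcessor:
    """Processor that owns no partition; it only keeps an LRU cache."""

    def __init__(self, storage, cache_size):
        self.storage, self.cache_size = storage, cache_size
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _neighbors(self, node):
        if node in self.cache:
            self.cache.move_to_end(node)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.cache[node] = self.storage.get(node)  # fetch from storage tier
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)  # evict least recently used
        return self.cache[node]

    def h_hop(self, source, hops):
        """Return every node within `hops` edges of `source` (BFS)."""
        seen = {source}
        frontier = deque([(source, 0)])
        while frontier:
            node, dist = frontier.popleft()
            if dist == hops:
                continue
            for nbr in self._neighbors(node):
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append((nbr, dist + 1))
        return seen


def nearest_landmark(adjacency, landmarks):
    """Multi-source BFS: label each node with the index of its closest landmark."""
    label = {lm: i for i, lm in enumerate(landmarks)}
    queue = deque(landmarks)
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in label:
                label[nbr] = label[node]
                queue.append(nbr)
    return label


def total_hits(adjacency, queries, route, num_processors=2):
    """Run 2-hop queries through `route` and count cache hits over all processors."""
    storage = HashPartitionedStorage(adjacency, num_servers=4)
    procs = [QueryProcessor(storage, cache_size=64) for _ in range(num_processors)]
    for q in queries:
        procs[route(q)].h_hop(q, 2)
    return sum(p.hits for p in procs)


def demo():
    n = 40
    ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
    queries = list(range(n))
    label = nearest_landmark(ring, landmarks=[0, 20])
    locality = total_hits(ring, queries, route=lambda q: label[q] % 2)
    turns = cycle(range(2))
    round_robin = total_hits(ring, queries, route=lambda q: next(turns))
    return locality, round_robin


if __name__ == "__main__":
    print(demo())
```

In this toy setting the locality-aware router gets more cache hits than round-robin, because queries for neighboring nodes reach the same processor and reuse the adjacency lists it has already fetched. Round-robin instead makes both processors fetch nearly the whole graph. The storage tier uses plain hash partitioning in both runs, so the difference comes from routing alone.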
