Parallel Graph Decomposition and Diameter Approximation in o(Diameter) Time and Linear Space

We present the first parallel (MapReduce) algorithm that approximates the diameter of large graphs through graph decomposition using a number of parallel rounds sublinear in the diameter and total space linear in the graph size. The quality of the diameter approximation is expressed in terms of the doubling dimension of the graph, and is polylogarithmic when the graph has constant doubling dimension. Extensive experiments demonstrate the effectiveness and efficiency of our approach on large graphs.
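To make the idea of estimating a diameter through decomposition concrete, the following is a minimal sequential sketch; it is not the algorithm of the paper and not a MapReduce implementation. It assumes a connected, unweighted, undirected graph given as an adjacency dictionary, and a list of cluster centers chosen by the caller. The function names (`decompose`, `quotient_diameter`, `diameter_bounds`) are hypothetical. Clusters are grown by multi-source BFS; if R is the largest cluster radius and D_C is the diameter of the quotient graph (one node per cluster), then the true diameter lies between D_C and D_C(2R + 1) + 2R.

```python
from collections import deque


def decompose(adj, centers):
    """Multi-source BFS: each node joins the cluster of the first center to reach it.

    Returns the node -> cluster-index map and the largest cluster radius.
    """
    cluster = {c: i for i, c in enumerate(centers)}
    depth = {c: 0 for c in centers}
    queue = deque(centers)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in cluster:
                cluster[v] = cluster[u]
                depth[v] = depth[u] + 1
                queue.append(v)
    return cluster, max(depth.values())


def quotient_diameter(adj, cluster):
    """Exact diameter of the quotient graph, by BFS from every cluster."""
    q = {c: set() for c in set(cluster.values())}
    for u in adj:
        for v in adj[u]:
            if cluster[u] != cluster[v]:
                q[cluster[u]].add(cluster[v])
    best = 0
    for s in q:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            a = queue.popleft()
            for b in q[a]:
                if b not in dist:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        best = max(best, max(dist.values()))
    return best


def diameter_bounds(adj, centers):
    """Return (lower, upper) bounds on the diameter of a connected graph."""
    cluster, radius = decompose(adj, centers)
    d_c = quotient_diameter(adj, cluster)
    # Any two nodes are joined by a walk that goes node -> its center,
    # hops across at most d_c cluster boundaries (each hop costs at most
    # 2 * radius + 1 edges), and ends center -> node.
    return d_c, d_c * (2 * radius + 1) + 2 * radius
```

On a 10-node path with centers 1, 5 and 8, the sketch yields three clusters of radius at most 2 and the bounds (2, 14), which bracket the true diameter of 9. The quotient graph is much smaller than the input, which is why its diameter can be computed cheaply; how tight the bounds are depends on how the centers are chosen.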
