Toward Optimal Bounds in the Congested Clique: Graph Connectivity and MST

We study two fundamental graph problems, Graph Connectivity (GC) and Minimum Spanning Tree (MST), in the well-studied Congested Clique model, and present several new bounds on the time and message complexities of randomized algorithms for these problems. No non-trivial (i.e., super-constant) time lower bounds are known for either of the aforementioned problems; in particular, an important open question is whether or not constant-round algorithms exist for these problems. We make progress toward answering this question by presenting randomized Monte Carlo algorithms for both problems that run in O(log log log n) rounds (where n is the size of the clique). Our results improve by an exponential factor on the long-standing (deterministic) time bound of O(log log n) rounds for these problems due to Lotker et al. (SICOMP 2005). Our algorithms make use of several algorithmic tools including graph sketching, random sampling, and fast sorting. The second contribution of this paper is to present several almost-tight bounds on the message complexity of these problems. Specifically, we show that Ω(n^2) messages are needed by any algorithm (including randomized Monte Carlo algorithms, and regardless of the number of rounds) that solves the GC (and hence also the MST) problem if each machine in the Congested Clique has initial knowledge only of itself (the so-called KT0 model). In contrast, if the machines have initial knowledge of their neighbors' IDs (the so-called KT1 model), we present a randomized Monte Carlo algorithm for MST that uses O(n polylog n) messages and runs in O(polylog n) rounds. To complement this, we also present a lower bound in the KT1 model that shows that Ω(n) messages are required by any algorithm that solves GC, regardless of the number of rounds used. Our results are a step toward understanding the power of randomization in the Congested Clique with respect to both time and message complexity.
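To give a feel for the "exponential factor" between the O(log log n) and O(log log log n) round bounds, the short sketch below tabulates both quantities for a few clique sizes. It is only an illustration: base-2 logarithms and dropped constant factors are assumptions made here, and the helper name `round_bounds` is hypothetical rather than anything from the paper.

```python
import math


def round_bounds(n: int) -> tuple[float, float]:
    """Return (log log n, log log log n) for clique size n.

    Assumptions for illustration only: base-2 logarithms and no
    hidden constants, so the values show growth rates, not real
    round counts of any algorithm.
    """
    loglog = math.log2(math.log2(n))
    logloglog = math.log2(loglog)
    return loglog, logloglog


# Tabulate the two bounds for a few clique sizes n = 2^k.
for k in (16, 64, 256):
    ll, lll = round_bounds(2 ** k)
    print(f"n = 2^{k}: log log n = {ll:.1f}, log log log n = {lll:.1f}")
```

Since log log log n = log(log log n), the new bound is the logarithm of the old one, which is what "improve by an exponential factor" refers to.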

[1]  Boaz Patt-Shamir,et al.  The round complexity of distributed sorting: extended abstract , 2011, PODC '11.

[2]  Fabian Kuhn,et al.  On the power of the congested clique model , 2014, PODC.

[3]  Graham Cormode,et al.  A unifying framework for ℓ0-sampling algorithms , 2014 .

[4]  Hartmut Klauck,et al.  Distributed Computation of Large-scale Graph Problems , 2015, SODA.

[5]  Baruch Awerbuch,et al.  A trade-off between information and communication in broadcast protocols , 1990, JACM.

[6]  Christoph Lenzen,et al.  Optimal deterministic routing and sorting on the congested clique , 2012, PODC '13.

[7]  Christoph Lenzen,et al.  "Tri, Tri Again": Finding Triangles and Small Subgraphs in a Distributed Setting - (Extended Abstract) , 2012, DISC.

[8]  Bruce M. Kapron,et al.  Dynamic graph connectivity in polylogarithmic worst case time , 2013, SODA.

[9]  Sudipto Guha,et al.  Graph sketches: sparsification, spanners, and subgraphs , 2012, PODS.

[10]  Boaz Patt-Shamir,et al.  The Round Complexity of Distributed Sorting , 2011 .

[11]  Philip N. Klein,et al.  A randomized linear-time algorithm to find minimum spanning trees , 1995, JACM.

[12]  Andrew McGregor,et al.  Graph stream algorithms: a survey , 2014, SGMD.

[13]  Christoph Lenzen,et al.  Algebraic methods in the congested clique , 2015, Distributed Computing.

[14]  Graham Cormode,et al.  A unifying framework for ℓ0-sampling algorithms , 2013, Distributed and Parallel Databases.

[15]  Rajeev Motwani,et al.  Randomized Algorithms , 1995, SIGA.

[16]  Stephan Holzer,et al.  Approximation of Distances and Shortest Paths in the Broadcast Congest Clique , 2014, OPODIS.

[17]  David P. Woodruff,et al.  When distributed computation is communication expensive , 2013, Distributed Computing.

[18]  Sriram V. Pemmaraju,et al.  Near-Constant-Time Distributed Algorithms on a Congested Clique , 2014, DISC.

[19]  Andrew Berns,et al.  Super-Fast Distributed Algorithms for Metric Facility Location , 2012, ICALP.

[20]  Hossein Jowhari,et al.  Tight bounds for Lp samplers, finding duplicates in streams, and related problems , 2010, PODS.

[21]  Boaz Patt-Shamir,et al.  Minimum-Weight Spanning Tree Construction in O(log log n) Communication Rounds , 2005, SIAM J. Comput.

[22]  Noga Alon,et al.  Space-efficient local computation algorithms , 2011, SODA.

[23]  Fabian Kuhn,et al.  The communication complexity of distributed task allocation , 2012, PODC '12.

[24]  Ben H. H. Juurlink,et al.  Communication-optimal parallel minimum spanning tree algorithms (extended abstract) , 1998, SPAA '98.

[25]  David Peleg,et al.  Distributed Computing: A Locality-Sensitive Approach , 1987 .

[26]  Danupon Nanongkai,et al.  Distributed approximation algorithms for weighted shortest paths , 2014, STOC.

[27]  Peter Robinson,et al.  Almost Optimal Distributed Algorithms for Large-Scale Graph Problems , 2015, ArXiv.

[28]  Sriram V. Pemmaraju,et al.  Lessons from the Congested Clique Applied to MapReduce , 2014, SIROCCO.

[29]  Sudipto Guha,et al.  Analyzing graph structure via linear measurements , 2012, SODA.

[30]  Micah Adler,et al.  Communication-optimal Parallel Minimum Spanning Tree Algorithms , 1998, SPAA 1998.

[31]  Joachim Gehweiler,et al.  A Distributed O(1)-Approximation Algorithm for the Uniform Facility Location Problem , 2006, SPAA '06.