Graph Sparsification for Derandomizing Massively Parallel Computation with Low Space

The Massively Parallel Computation (MPC) model is an emerging model that distills core aspects of distributed and parallel computation, developed as a tool to solve combinatorial (typically graph) problems in systems of many machines with limited space. Recent work has focused on the regime in which machines have sublinear (in n, the number of nodes in the input graph) space, with randomized algorithms presented for the fundamental problems of Maximal Matching and Maximal Independent Set. However, there have been no prior corresponding deterministic algorithms. A major challenge underlying the sublinear space setting is that the local space of each machine might be too small to store all edges incident to a single node. This poses a considerable obstacle compared to classical models in which each node is assumed to know and have easy access to its incident edges. To overcome this barrier, we introduce a new graph sparsification technique that deterministically computes a low-degree subgraph, with the additional property that solving the problem on this subgraph provides significant progress towards solving the problem for the original input graph. Using this framework to derandomize the well-known algorithm of Luby [SICOMP'86], we obtain O(log Δ + log log n)-round deterministic MPC algorithms for solving the problems of Maximal Matching and Maximal Independent Set with O(n^ε) space on each machine for any constant ε > 0. These algorithms also run in O(log Δ) rounds in the closely related model of CONGESTED CLIQUE, improving upon the state-of-the-art bound of O(log^2 Δ) rounds by Censor-Hillel et al. [DISC'17].
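For orientation, the randomized algorithm of Luby that the abstract refers to can be sketched sequentially as below. This is a minimal illustration of the classical random-priority variant of Luby's Maximal Independent Set algorithm, not the deterministic MPC algorithm of this paper, and it ignores all space and communication constraints of the MPC model. The function name `luby_mis`, the adjacency-dictionary input format, and the `seed` parameter are assumptions made for the example.

```python
import random


def luby_mis(adj, seed=0):
    """Randomized MIS in the style of Luby's algorithm (sequential simulation).

    adj: dict mapping each node (assumed to be a comparable label, e.g. an int)
         to the set of its neighbours in an undirected graph without self-loops.
    Returns a set of nodes forming a maximal independent set.
    """
    rng = random.Random(seed)
    # Residual graph: nodes not yet decided, with their undecided neighbours.
    live = {v: set(nbrs) for v, nbrs in adj.items()}
    mis = set()

    while live:
        # Each undecided node draws a random priority for this round.
        prio = {v: rng.random() for v in live}

        # A node joins the MIS if it beats every undecided neighbour.
        # The node label breaks ties, so two adjacent nodes can never both win.
        winners = {
            v for v in live
            if all((prio[v], v) < (prio[u], u) for u in live[v])
        }
        mis |= winners

        # Winners and all of their neighbours are now decided; drop them.
        removed = set(winners)
        for v in winners:
            removed |= live[v]
        for v in removed:
            del live[v]
        for v in live:
            live[v] -= removed

    return mis
```

Each round, the node with the globally smallest priority always wins, so the loop terminates. The paper's contribution lies in replacing the random priorities with deterministic choices that can be computed when a machine cannot even hold all edges of a single node, which this sketch does not attempt.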
