Brief Announcement: Semi-MapReduce Meets Congested Clique

Graph problems are troublesome when it comes to MapReduce. Typically, designing algorithms that exploit the advantages of MapReduce requires assumptions beyond what the model imposes, such as the {\em density} of the input graph. In a recent shift, a simple and robust model of MapReduce for graph problems, in which the space per machine is set to $O(|V|)$, has attracted considerable attention. We term this model {\em semi-MapReduce}, or semi-MPC for short, and focus on its computational power. In this short note, we show through a set of simulation methods that semi-MPC is, perhaps surprisingly, almost equivalent to the congested clique model of distributed computing. However, in addition to round complexity, semi-MPC incorporates another practically important dimension to optimize: the number of machines. Furthermore, we show that algorithms in other distributed computing models, such as CONGEST, can be simulated in semi-MPC in the same number of rounds while also using an optimal number of machines. We then show the implications of these simulation methods by obtaining improved algorithms for these models from recently developed algorithms.
