Distributed Weighted Matching via Randomized Composable Coresets

Maximum weight matching is one of the most fundamental combinatorial optimization problems, with a wide range of applications in data mining and bioinformatics. Developing distributed weighted matching algorithms is challenging because efficient algorithms for this problem are inherently sequential. In this paper, we develop a simple distributed algorithm for the problem on general graphs with an approximation guarantee of $2+\varepsilon$, which (nearly) matches that of the sequential greedy algorithm. A key advantage of this algorithm is that it can be implemented in only two rounds of computation in modern parallel computation frameworks such as MapReduce. We also demonstrate the efficiency of our algorithm in practice on various graphs (some with half a trillion edges), achieving objective values that are always close to what is achievable in the centralized setting.
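To make the two-round pattern concrete, the following is a minimal single-process sketch, not the paper's algorithm. It assumes that edges are partitioned uniformly at random across machines, that each machine sends back its local greedy matching as a coreset, and that a coordinator runs greedy again on the union of the coresets. The names `greedy_matching` and `two_round_matching`, the `(u, v, weight)` edge format, and the `seed` parameter are illustrative assumptions, and this simplified version is not claimed to achieve the $2+\varepsilon$ guarantee.

```python
import random
from typing import Iterable, List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight); hypothetical edge format


def greedy_matching(edges: Iterable[Edge]) -> List[Edge]:
    """Sequential greedy: scan edges by decreasing weight and keep an edge
    whenever both of its endpoints are still unmatched."""
    matched = set()
    matching = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if u != v and u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v, w))
    return matching


def two_round_matching(edges: Iterable[Edge], num_machines: int,
                       seed: int = 0) -> List[Edge]:
    """Simulate the two rounds in one process (assumed scheme, see lead-in)."""
    rng = random.Random(seed)
    # Random partition: each edge goes to a machine chosen uniformly at random.
    parts: List[List[Edge]] = [[] for _ in range(num_machines)]
    for e in edges:
        parts[rng.randrange(num_machines)].append(e)
    # Round 1: every machine computes a greedy matching on its own edges.
    coreset = [e for part in parts for e in greedy_matching(part)]
    # Round 2: a single machine runs greedy on the union of the coresets.
    return greedy_matching(coreset)


if __name__ == "__main__":
    path = [(0, 1, 1.5), (1, 2, 2.0), (2, 3, 1.5)]
    print(greedy_matching(path))
    print(two_round_matching(path, num_machines=3, seed=1))
```

On the three-edge path in the usage example, sequential greedy keeps only the middle edge (weight 2.0), while the optimum takes the two outer edges (total weight 3.0), which illustrates why greedy is only an approximation.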
