Matching in Bipartite Graph Streams in a Small Number of Passes

We consider the maximum-cardinality matching problem in bipartite graphs. The input graph G = (V,E) is not available for random access, but only as a stream, and random-access memory is limited to storing Θ(n) edges at a time, where n = |V|. An important measure is the number of passes over the input stream required to achieve the desired approximation. It was shown by Eggert et al. (2009, 2011) that a 1+1/k approximation can be computed in O(k^5) passes, independently of the input size. In this work, we present a new algorithm with the same approximation guarantee of 1+1/k, and show experimentally that it requires two orders of magnitude fewer passes than the previous algorithm. The proven bound on the number of passes is O(kn). This bound depends on the input size, and so in principle is inferior to O(k^5). But we emphasize that in experiments, we do not find any correlation between theoretical bounds and actual performance: for all algorithms, the number of passes observed in experiments is far below the corresponding theoretical bound. The most interesting insight comes from an experimental comparison of the previous and the new algorithm: e.g., for k = 9, the new one never needed more than 94 passes, even for instances with up to 2 × 10^6 vertices, whereas the previous one went up to more than 32 000 passes. Our main new technique is aimed at making the most out of each pass: we maintain a complex structure, using trees, for building augmenting paths.
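
To make the streaming model concrete, the following minimal sketch shows a single pass over an edge stream that keeps only a matching in memory. It is not the algorithm of this work and not that of Eggert et al.: it is the standard greedy procedure, which yields a maximal matching and hence only a 2-approximation. The names EdgeStream and greedy_matching_one_pass are hypothetical and chosen for illustration, and the stream is simulated by an in-memory list that the algorithm accesses only by sequential iteration.

```python
from typing import Dict, Hashable, Iterable, Iterator, List, Tuple

Edge = Tuple[Hashable, Hashable]


class EdgeStream:
    """Hypothetical stand-in for an external edge stream.

    The edges are held in a list only to simulate the input; the
    algorithm below reads them sequentially and never indexes into them.
    Every call to iter() counts as one pass.
    """

    def __init__(self, edges: Iterable[Edge]) -> None:
        self._edges: List[Edge] = list(edges)
        self.passes = 0

    def __iter__(self) -> Iterator[Edge]:
        self.passes += 1
        return iter(self._edges)


def greedy_matching_one_pass(stream: Iterable[Edge]) -> Dict[Hashable, Hashable]:
    """Keep an edge if and only if both endpoints are still unmatched.

    The working memory is the mate table, which holds at most one entry
    per vertex, so it stays within the Theta(n) budget of the model.
    """
    mate: Dict[Hashable, Hashable] = {}
    for u, v in stream:
        if u not in mate and v not in mate:
            mate[u] = v
            mate[v] = u
    return mate


if __name__ == "__main__":
    # Left vertices a1, a2; right vertices b1, b2.
    stream = EdgeStream([("a1", "b1"), ("a1", "b2"), ("a2", "b1")])
    mate = greedy_matching_one_pass(stream)
    print(len(mate) // 2, "matched edge(s) after", stream.passes, "pass(es)")
```

On this small instance the greedy pass commits to the edge (a1, b1) and then has to reject both remaining edges, although the matching {(a1, b2), (a2, b1)} has cardinality two. Closing such gaps requires augmenting paths, and finding them is what costs additional passes.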