Set Cover in Sub-linear Time

We study the classic set cover problem from the perspective of sub-linear algorithms. Given access to a collection of m sets over n elements in the query model, we show that sub-linear algorithms derived from existing techniques have almost tight query complexities. First, we adapt the streaming algorithm of [17] to the sub-linear query model, obtaining an algorithm that returns an α-approximate cover using O(m(n/k)^{1/(α−1)} + nk) queries to the input, where k denotes the size of a minimum set cover. We complement this upper bound by proving that for smaller values of k, the required number of queries is [EQUATION], even for estimating the optimal cover size. Moreover, we prove that even checking whether a given collection of sets covers all the elements requires Ω(nk) queries. These two lower bounds provide strong evidence that the upper bound is almost tight for certain values of the parameter k. Second, we show that this bound is not optimal for larger values of k, as there exists a (1 + ε)-approximation algorithm with O(mn/(kε^2)) queries. We show that this bound is essentially tight for sufficiently small constant ε by establishing a lower bound of [EQUATION] on the query complexity. Our lower bounds follow from carefully designing two distributions of instances that are hard to distinguish. In particular, our first lower bound involves a probabilistic construction of a set system with a minimum set cover of size αk, with the key property that a small number of "almost uniformly distributed" modifications can reduce the minimum set cover size to k. These modifications are therefore not detectable unless a large number of queries are asked. We believe that our probabilistic construction technique might find applications to lower bounds for other combinatorial optimization problems.
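To make the query model concrete, the following is a minimal Python sketch of a query-counting oracle together with the naive check of whether a chosen collection of sets covers all elements. It is an illustration only and is not taken from the paper: the class name SetSystemOracle, the method elt_of (return the j-th element of set i), and the function covers_all are hypothetical names, and the exact query types of the model are assumed. The naive check reads every chosen set in full, so for k sets of up to n elements each it spends at most k(n + 1) queries, the same order as the Ω(nk) lower bound stated above for verifying a cover.

```python
class SetSystemOracle:
    """Hypothetical oracle for a set system; counts every query to the input."""

    def __init__(self, sets):
        # sets: list of lists of element ids drawn from range(n) (assumed format)
        self._sets = [list(s) for s in sets]
        self.queries = 0

    def elt_of(self, i, j):
        """Return the j-th element of set i, or None if set i has fewer elements."""
        self.queries += 1
        s = self._sets[i]
        return s[j] if j < len(s) else None


def covers_all(oracle, chosen, n):
    """Naively check whether the sets indexed by `chosen` cover elements 0..n-1.

    Each chosen set is read to its end, costing (set size + 1) queries.
    """
    covered = set()
    for i in chosen:
        j = 0
        while True:
            e = oracle.elt_of(i, j)
            if e is None:  # reached the end of set i
                break
            covered.add(e)
            j += 1
    return len(covered) == n


if __name__ == "__main__":
    oracle = SetSystemOracle([[0, 1, 2], [2, 3], [4]])
    print(covers_all(oracle, [0, 1, 2], n=5), oracle.queries)
```

Counting queries inside the oracle keeps the cost measure separate from the algorithm being measured, so the same wrapper could be reused to compare other strategies under the same assumptions.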

[1] David Steurer, et al. Analytical approach to parallel repetition, 2013, STOC.

[2] Morteza Zadimoghaddam, et al. Randomized Composable Core-sets for Distributed Submodular Maximization, 2015, STOC.

[3] Roger Wattenhofer, et al. The price of being near-sighted, 2006, SODA '06.

[4] Lise Getoor, et al. On Maximum Coverage in the Streaming Model & Application to Multi-topic Blog-Watch, 2009, SDM.

[5] Vahab S. Mirrokni, et al. Distributed Coverage Maximization via Sketching, 2016, ArXiv.

[6] Ashish Goel, et al. Perfect Matchings in O(n log n) Time in Regular Bipartite Graphs, 2013, SIAM J. Comput.

[7] Dana Moshkovitz, et al. The Projection Games Conjecture and the NP-Hardness of ln n-Approximating Set-Cover, 2012, Theory Comput.

[8] Leonid Khachiyan, et al. A sublinear-time randomized approximation algorithm for matrix games, 1995, Oper. Res. Lett.

[9] Dana Ron, et al. On Approximating the Minimum Vertex Cover in Sublinear Time and the Connection to Distributed Algorithms, 2007, Electron. Colloquium Comput. Complex.

[10] Ran Raz, et al. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP, 1997, STOC '97.

[11] Piotr Indyk, et al. Towards Tight Bounds for the Streaming Set Cover Problem, 2015, PODS.

[12] Sergei Vassilvitskii, et al. Fast Greedy Algorithms in MapReduce and Streaming, 2015, ACM Trans. Parallel Comput.

[13] Ashish Goel, et al. Perfect matchings in o(n log n) time in regular bipartite graphs, 2009, STOC '10.

[14] Noga Alon, et al. Algorithmic construction of sets for k-restrictions, 2006, TALG.

[15] Russ Bubley, et al. Randomized algorithms, 1995, CSUR.

[16] Krzysztof Onak, et al. A near-optimal sublinear-time algorithm for approximating the minimum vertex cover size, 2011, SODA.

[17] Yang Li, et al. Tight bounds for single-pass streaming complexity of the set cover problem, 2016, STOC.

[18] Charalampos E. Tsourakakis, et al. Space- and Time-Efficient Algorithm for Maintaining Dense Subgraphs on One-Pass Dynamic Streams, 2015, STOC.

[19] Sepehr Assadi, et al. Tight Space-Approximation Tradeoff for the Multi-Pass Streaming Set Cover Problem, 2017, PODS.

[20] Dana Ron, et al. Distance Approximation in Bounded-Degree and General Sparse Graphs, 2006, APPROX-RANDOM.

[21] Vahab S. Mirrokni, et al. Almost Optimal Streaming Algorithms for Coverage Problems, 2016, SPAA.

[22] Andrew McGregor, et al. Better Streaming Algorithms for the Maximum Coverage Problem, 2018, Theory of Computing Systems.

[23] T. Grossman, et al. Computational Experience with Approximation Algorithms for the Set Covering Problem, 1994.

[24] Piotr Indyk, et al. On Streaming and Communication Complexity of the Set Cover Problem, 2014, DISC.

[25] Krzysztof Onak, et al. Constant-Time Approximation Algorithms via Local Improvements, 2008, 49th Annual IEEE Symposium on Foundations of Computer Science.

[26] Adi Rosén, et al. Semi-Streaming Set Cover, 2014, ACM Trans. Algorithms.

[27] Ravi Kumar, et al. Max-cover in map-reduce, 2010, WWW '10.

[28] Umesh V. Vazirani, et al. An Introduction to Computational Learning Theory, 1994.

[29] Christos Koufogiannakis, et al. A Nearly Linear-Time PTAS for Explicit Fractional Packing and Covering Linear Programs, 2013, Algorithmica.

[30] Ronitt Rubinfeld, et al. Fractional Set Cover in the Streaming Model, 2017, APPROX-RANDOM.

[31] Amit Chakrabarti, et al. Incidence Geometries and the Pass Complexity of Semi-Streaming Set Cover, 2015, SODA.

[32] Yuichi Yoshida, et al. Improved Constant-Time Approximation Algorithms for Maximum Matchings and Other Optimization Problems, 2012, SIAM J. Comput.

[33] Bernard Chazelle, et al. Approximating the Minimum Spanning Tree Weight in Sublinear Time, 2001, ICALP.