Algorithms column: sublinear time algorithms

With the recent tremendous increase in computational power and cheap storage, we are blessed with a wealth of available, and possibly useful, information. It is always nice to have something for (almost) nothing. However, this blessing is also something of a curse, for we may also be asked to do something meaningful with all of this data. The scale of these data sets, coupled with the typical situation in which there is very little time to perform our computations, raises the question of which computations one could hope to accomplish extremely quickly. In particular, what can one solve in sublinear time? Sublinear time is a daunting goal, since it allows one to read only a minuscule fraction of the input. Still, there are problems for which deterministic, exact sublinear time algorithms are known. However, since any sublinear time algorithm can view only a small portion of the input, for most natural problems the algorithm must use randomization and must give an answer that is in some sense approximate. There is a growing body of work aimed at finding sublinear time algorithms for various problems. Recent results have shown that there are optimization problems whose values can be approximated in sublinear time. In addition, property testing, an alternative notion of approximation for decision problems, has been applied to give sublinear time algorithms for a wide variety of problems. One can also test various properties of distributions, where access to the distribution is given through samples generated according to the distribution, in time sublinear in the size of the support of the distribution. Several useful techniques, including the use of the Szemerédi Regularity lemma
