Property testing and its connection to learning and approximation

In this paper, we consider the question of determining whether a function <italic>f</italic> has property P or is ε-far from any function with property P. A <italic>property testing</italic> algorithm is given a sample of the value of <italic>f</italic> on instances drawn according to some distribution. In some cases, it is also allowed to query <italic>f</italic> on instances of its choice. We study this question for different properties and establish some connections to problems in learning theory and approximation. In particular, we focus our attention on testing graph properties. Given query access to a graph G, where each query asks whether an edge exists between a chosen pair of vertices, we devise algorithms to test whether the graph has properties such as being bipartite, being <italic>k</italic>-colorable, or having a <italic>ρ</italic>-clique (a clique of density <italic>ρ</italic> with respect to the vertex set). Our graph property testing algorithms are probabilistic and make assertions that are correct with high probability, while making a number of queries that is <italic>independent</italic> of the size of the graph. Moreover, if the tested property holds for the input graph, the property testing algorithms can be used to efficiently (i.e., in time linear in the number of vertices) construct partitions of the graph that correspond to the property.
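
To make the query model concrete, the following is a minimal sketch of a bipartiteness tester in the spirit described above: it samples a fixed number of vertices, queries every pair among them, and accepts only if the induced subgraph is bipartite. The names `bipartite_tester`, `has_edge`, and `sample_size` are illustrative assumptions and do not come from the paper. The abstract gives no bound on the sample size, so it is left as a free parameter here. The sketch makes at most `sample_size * (sample_size - 1) / 2` edge queries, regardless of the number of vertices `n`.

```python
import random
from collections import deque
from itertools import combinations


def is_bipartite(vertices, adj):
    """Return True if the graph given by adjacency sets `adj` is 2-colorable (BFS)."""
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # odd cycle found
    return True


def bipartite_tester(n, has_edge, sample_size, rng=random):
    """Accept iff the subgraph induced by a uniform vertex sample is bipartite.

    `has_edge(u, v)` is the (hypothetical) edge-query oracle; `sample_size`
    is an assumed parameter, not a bound taken from the paper.
    """
    sample = rng.sample(range(n), min(sample_size, n))
    adj = {v: set() for v in sample}
    for u, v in combinations(sample, 2):
        if has_edge(u, v):  # one edge query per sampled pair
            adj[u].add(v)
            adj[v].add(u)
    return is_bipartite(sample, adj)


# A complete bipartite graph (even vs. odd vertices) is always accepted.
print(bipartite_tester(1000, lambda u, v: (u % 2) != (v % 2), 20))
# A complete graph is rejected as soon as the sample contains a triangle.
print(bipartite_tester(1000, lambda u, v: u != v, 20))
```

Because every induced subgraph of a bipartite graph is bipartite, this sketch never rejects a bipartite input; errors can only occur by accepting a graph that is far from bipartite, and how large `sample_size` must be to make that unlikely is exactly the kind of question the paper analyzes.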
