Computational sample complexity

In a variety of PAC learning models, a trade-off between time and information seems to exist: with unlimited time, a small amount of information suffices, but with time restrictions, more information sometimes seems to be required. In addition, it has long been known that there are concept classes that can be learned in the absence of computational restrictions, but (under standard cryptographic assumptions) cannot be learned in polynomial time (regardless of sample size). Yet, these results do not answer the question of whether there are classes for which learning from a small set of examples is computationally infeasible, but becomes feasible when the learner has access to (polynomially) more examples. To address this question, we introduce a new measure of learning complexity called computational sample complexity that represents the number of examples sufficient for polynomial-time learning with respect to a fixed distribution. We then show concept classes that (under similar cryptographic assumptions) possess arbitrarily sized gaps between their standard (information-theoretic) sample complexity and their computational sample complexity. We also demonstrate such gaps for learning from membership queries and learning from noisy examples.
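As a rough formalization of this measure (the (epsilon, delta) parameterization and the quantifier order below are assumptions for illustration, not spelled out in the abstract), the computational sample complexity of a concept class C with respect to a fixed distribution D might be sketched as:

% A hedged sketch; the accuracy/confidence parameters and quantifier
% structure are assumed here, not taken from the abstract.
\[
  \mathrm{CSC}_{\mathcal{C},D}(\varepsilon,\delta)
  = \min\Bigl\{\, m \;:\; \exists\, \text{poly-time } A \;\;
    \forall\, c \in \mathcal{C} :\;
    \Pr_{S \sim (D,c)^{m}}\bigl[\operatorname{err}_{D,c}(A(S)) \le \varepsilon\bigr]
    \ge 1-\delta \,\Bigr\}
\]

where \(S\) is a sample of \(m\) examples drawn from \(D\) and labeled by the target \(c\), and \(\operatorname{err}_{D,c}(h) = \Pr_{x \sim D}[h(x) \ne c(x)]\). The standard (information-theoretic) sample complexity would be the same quantity with the polynomial-time restriction on \(A\) removed; the gap results say these two quantities can differ by arbitrarily large factors under the stated cryptographic assumptions.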
