The Probably Approximately Correct (PAC) and Other Learning Models

This paper surveys some recent theoretical results on the efficiency of machine learning algorithms. The main tool described is the notion of Probably Approximately Correct (PAC) learning, introduced by Valiant. We define this learning model and then look at some of the results obtained in it. We then consider some criticisms of the PAC model and the extensions proposed to address these criticisms. Finally, we look briefly at other models recently proposed in computational learning theory.
