Paper information - An overview of statistical learning theory

An overview of statistical learning theory

Statistical learning theory was introduced in the late 1960s. Until the 1990s it was a purely theoretical analysis of the problem of function estimation from a given collection of data. In the mid-1990s, new types of learning algorithms (called support vector machines) based on the developed theory were proposed. This made statistical learning theory not only a tool for theoretical analysis but also a tool for creating practical algorithms for estimating multidimensional functions. This article presents a very general overview of statistical learning theory, including both theoretical and algorithmic aspects of the theory. The goal of this overview is to demonstrate how the abstract learning theory established conditions for generalization that are more general than those discussed in classical statistical paradigms, and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems. A more detailed overview of the theory (without proofs) can be found in Vapnik (1995). A detailed description of the theory (including proofs) can be found in Vapnik (1998).

Vladimir Vapnik

[1] Vladimir Vapnik, Alexey Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities, 1971.

[2] V. Vapnik, et al. Necessary and Sufficient Conditions for the Uniform Convergence of Means to their Expectations, 1982.

[3] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[4] Y. L. Cun. Learning Process in an Asymmetric Threshold Network, 1986.

[5] Yann LeCun, et al. Learning processes in an asymmetric threshold network, 1986.

[6] David Haussler, et al. Learnability and the Vapnik-Chervonenkis dimension, 1989, JACM.

[7] G. Wahba. Spline models for observational data, 1990.

[8] Bernhard E. Boser, et al. A training algorithm for optimal margin classifiers, 1992, COLT '92.

[9] Philip M. Long, et al. Fat-shattering and the learnability of real-valued functions, 1994, COLT '94.

[10] Tomaso A. Poggio, et al. Regularization Theory and Neural Networks Architectures, 1995, Neural Computation.

[11] Christopher J. C. Burges, et al. Simplified Support Vector Decision Rules, 1996, ICML.