Tutorial on Practical Prediction Theory for Classification

We discuss basic prediction theory and its impact on the evaluation of classification success, its implications for learning algorithm design, and its uses during learning algorithm execution. This tutorial is meant to be a comprehensive compilation of results which are both theoretically rigorous and quantitatively useful. There are two important implications of the results presented here. The first is that common practices for reporting results in classification should change to use the test set bound. The second is that train set bounds can sometimes be used to directly motivate learning algorithms.
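
A minimal sketch, assuming SciPy is available, of how a test set bound of this kind can be computed: invert the binomial tail to turn an observed test error count into a high-confidence upper bound on the true error rate. The function name and bisection tolerance below are illustrative choices, not taken from the tutorial's text.

    # Binomial tail inversion: largest error rate p such that
    # P[Binomial(m, p) <= k] >= delta. With probability at least 1 - delta
    # over the draw of an i.i.d. test set, the true error rate of a fixed
    # classifier with k errors on m test examples is at most this value.
    from scipy.stats import binom

    def binomial_tail_inversion(m, k, delta, tol=1e-10):
        lo, hi = k / m, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if binom.cdf(k, m, mid) >= delta:
                lo = mid   # tail probability still >= delta, bound can grow
            else:
                hi = mid
        return hi

    # Example: 8 errors on a 1000-example test set, 95% confidence.
    print(binomial_tail_inversion(m=1000, k=8, delta=0.05))

The bisection works because the binomial tail probability is monotonically decreasing in p, so the set of error rates consistent with the observed test performance at confidence delta is an interval.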
