Stability and Generalization
[1] L. Goddard. Information Theory , 1962, Nature.
[2] 丸山 徹. Convex Analysisの二,三の進展について [On a few developments in convex analysis] , 1977 .
[3] W. Rogers,et al. A Finite Sample Distribution-Free Performance Bound for Local Discrimination Rules , 1978 .
[4] Luc Devroye,et al. Distribution-free performance bounds for potential function rules , 1979, IEEE Trans. Inf. Theory.
[5] Luc Devroye,et al. Distribution-free inequalities for the deleted and holdout error estimates , 1979, IEEE Trans. Inf. Theory.
[6] J. Steele. An Efron-Stein inequality for nonsymmetric statistics , 1986 .
[7] Colin McDiarmid,et al. Surveys in Combinatorics, 1989: On the method of bounded differences , 1989 .
[8] David Haussler,et al. Learnability and the Vapnik-Chervonenkis dimension , 1989, JACM.
[9] T. Poggio,et al. Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks , 1990, Science.
[10] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[11] L. Devroye. Exponential Inequalities in Nonparametric Estimation , 1991 .
[12] David Haussler,et al. Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications , 1992, Inf. Comput..
[13] Gábor Lugosi,et al. On the posterior-probability estimate of the error rate of nonparametric classification rules , 1993, IEEE Trans. Inf. Theory.
[14] László Györfi,et al. A Probabilistic Theory of Pattern Recognition , 1996, Stochastic Modelling and Applied Probability.
[15] Peter L. Bartlett,et al. For Valid Generalization the Size of the Weights is More Important than the Size of the Network , 1996, NIPS.
[16] L. Breiman. Heuristics of instability and stabilization in model selection , 1996 .
[17] M. Talagrand. A new look at independence , 1996 .
[18] Noga Alon,et al. Scale-sensitive dimensions, uniform convergence, and learnability , 1997, JACM.
[19] Dana Ron,et al. Algorithmic Stability and Sanity-Check Bounds for Leave-one-Out Cross-Validation , 1997, COLT.
[20] Peter L. Bartlett,et al. The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network , 1998, IEEE Trans. Inf. Theory.
[21] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[22] Alexander Shapiro,et al. Optimization Problems with Perturbations: A Guided Tour , 1998, SIAM Rev..
[23] Geoffrey J. Gordon,et al. Approximate solutions to Markov decision processes , 1999 .
[24] Tomaso Poggio,et al. A Unified Framework for Regularization Networks and Support Vector Machines , 1999 .
[25] Tommi S. Jaakkola,et al. Maximum Entropy Discrimination , 1999, NIPS.
[26] André Elisseeff,et al. Algorithmic Stability and Generalization Performance , 2000, NIPS.
[27] G. Wahba. An introduction to model building with reproducing kernel Hilbert spaces , 2000 .