Paper Information - Near-Optimal Bounds for Cross-Validation via Loss Stability

Near-Optimal Bounds for Cross-Validation via Loss Stability

Multi-fold cross-validation is an established practice to estimate the error rate of a learning algorithm. Quantifying the variance reduction gains due to cross-validation has been challenging due to the inherent correlations introduced by the folds. In this work we introduce a new and weak measure called loss stability and relate the cross-validation performance to this measure; we also establish that this relationship is near-optimal. Our work thus quantitatively improves the current best bounds on cross-validation.
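For readers unfamiliar with the procedure the abstract refers to, the following is a minimal sketch of a multi-fold (k-fold) cross-validation error estimate. It illustrates only the estimator itself, not the paper's loss stability measure, which the abstract does not define. The names `k_fold_error`, `train_majority`, `zero_one_loss`, and `toy_data` are hypothetical and chosen for illustration; the majority-label learner is a toy stand-in for any learning algorithm.

```python
import random
from statistics import mean


def k_fold_error(data, k, train, loss, seed=0):
    """Return (average held-out loss, per-fold losses) for k-fold CV.

    Hypothetical helper: `train` maps a list of examples to a model,
    `loss` maps (model, example) to a number.
    """
    items = list(data)
    random.Random(seed).shuffle(items)
    # Split into k disjoint folds of near-equal size.
    folds = [items[i::k] for i in range(k)]
    fold_losses = []
    for i, held_out in enumerate(folds):
        # Train on every fold except the i-th, then test on the i-th.
        training = [z for j, fold in enumerate(folds) if j != i for z in fold]
        model = train(training)
        fold_losses.append(mean(loss(model, z) for z in held_out))
    return mean(fold_losses), fold_losses


def train_majority(sample):
    """Toy learner (assumption): always predict the majority label."""
    ones = sum(y for _, y in sample)
    return 1 if 2 * ones >= len(sample) else 0


def zero_one_loss(model, example):
    """0-1 loss of a constant prediction on one (x, y) example."""
    _, y = example
    return 0 if model == y else 1


# Toy data (assumption): 40 examples, label 0 when x is a multiple of 4.
toy_data = [(x, 1 if x % 4 else 0) for x in range(40)]
cv_error, per_fold = k_fold_error(
    toy_data, k=5, train=train_majority, loss=zero_one_loss
)
print(cv_error, per_fold)
```

The per-fold losses are correlated because any two training sets share most of their examples, which is the difficulty the abstract mentions when it speaks of quantifying the variance reduction from averaging over folds.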
