Fast Cross-Validation for Incremental Learning

Cross-validation (CV) is one of the main tools for performance estimation and parameter tuning in machine learning. The standard recipe for computing a CV estimate is to run the learning algorithm separately for each CV fold, a computationally expensive process. In this paper, we propose a new approach to reduce the computational burden of CV-based performance estimation. Unlike previous attempts, which are specific to a particular learning model or problem domain, we propose a general method applicable to a large class of incremental learning algorithms, which are particularly well suited to big-data problems. In particular, our method applies to a wide range of supervised and unsupervised learning tasks with different performance criteria, as long as the base learning algorithm is incremental. We show that the running time of the algorithm scales logarithmically, rather than linearly, in the number of CV folds. Furthermore, the algorithm has favorable properties for parallel and distributed implementation. Experiments with state-of-the-art incremental learning algorithms confirm the practicality of the proposed method.
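To make the logarithmic scaling concrete, the sketch below organizes the k folds as the leaves of a binary tree and shares incremental updates along common prefixes: when descending from a node to one child, the other child's folds are fed to a copy of the current model, so each sample passes through the incremental update roughly log2(k) times instead of k-1 times. This is only a minimal illustration under our own assumptions, not the paper's algorithm verbatim: it uses scikit-learn's SGDClassifier as the incremental base learner, copy.deepcopy for model snapshots, and hypothetical helper names (tree_cv, descend).

# Sketch: tree-structured k-fold CV for an incremental learner (assumptions above).
import copy
import numpy as np
from sklearn.linear_model import SGDClassifier

def tree_cv(X, y, n_folds=8, seed=0):
    """Return one held-out accuracy per fold; each sample is passed to
    partial_fit O(log n_folds) times instead of O(n_folds) times.
    Assumes n_folds >= 2."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    # Shuffle the sample indices and split them into n_folds groups.
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = [None] * n_folds

    def descend(model, fold_ids):
        if len(fold_ids) == 1:
            # Leaf: `model` has been trained on every fold except this one.
            f = fold_ids[0]
            scores[f] = model.score(X[folds[f]], y[folds[f]])
            return
        mid = len(fold_ids) // 2
        left, right = fold_ids[:mid], fold_ids[mid:]
        # Each branch receives the *other* branch's folds before recursing,
        # so updates shared by several held-out folds are computed only once.
        for branch, other in ((left, right), (right, left)):
            m = copy.deepcopy(model)
            idx = np.concatenate([folds[f] for f in other])
            m.partial_fit(X[idx], y[idx], classes=classes)
            descend(m, branch)

    # The root model has seen no data; folds are added on the way down the tree.
    descend(SGDClassifier(random_state=seed), list(range(n_folds)))
    return scores

For example, scores = tree_cv(X, y, n_folds=16) yields sixteen held-out accuracies after roughly log2(16) = 4 full incremental passes over the data, rather than the 15 passes per fold that naive retraining would require. Note that for order-sensitive incremental learners the per-fold model sees the remaining folds in tree-traversal order rather than a fixed order, which is one of the issues a formal analysis of such a scheme must account for.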
