Extensions to Metric-Based Model Selection

Metric-based methods have recently been introduced for model selection and regularization, often yielding very significant improvements over the alternatives tried (including cross-validation). All these methods require unlabeled data over which to compare functions and detect gross differences in behavior away from the training points. We introduce three new extensions of the metric model selection methods and apply them to feature selection. The first extension takes advantage of the particular case of time-series data in which the task involves prediction with a horizon h. The idea is to use, at time t, the h unlabeled examples that precede t for model selection. The second extension takes advantage of the different error distributions of cross-validation and the metric methods: cross-validation is unbiased but tends to have a larger variance. A hybrid combining the two model selection methods is rarely beaten by either of the two methods. The third extension deals with the case when unlabeled data is not available at all, using an estimated input density. Experiments are described to study these extensions in the context of capacity control and feature subset selection.
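To make the role of unlabeled data concrete, the following is a minimal sketch of a metric-based selection rule, written in the spirit of the adjusted-error (ADJ) criterion from the metric-based literature. It is an illustration under stated assumptions, not the procedure used in this paper: the nested hypothesis classes are polynomials of increasing degree, distances are root-mean-square differences between predictions, and the names `rms_distance` and `select_degree` are hypothetical.

```python
import numpy as np


def rms_distance(pred_a, pred_b):
    """Root-mean-square difference between two prediction vectors."""
    return float(np.sqrt(np.mean((pred_a - pred_b) ** 2)))


def select_degree(x_train, y_train, x_unlabeled, max_degree=8):
    """Pick a polynomial degree with an ADJ-style adjusted training error.

    Assumption: hypothesis classes are polynomials of degree 0..max_degree,
    each fitted by least squares on the labeled training set.
    """
    fits = [
        np.polynomial.Polynomial.fit(x_train, y_train, degree)
        for degree in range(max_degree + 1)
    ]

    scores = []
    for k, f_k in enumerate(fits):
        train_err = rms_distance(f_k(x_train), y_train)

        # Compare f_k with every simpler model f_j, once on the training
        # inputs and once on the unlabeled inputs. A large ratio means the
        # two functions agree near the training points but diverge elsewhere.
        ratios = []
        for f_j in fits[:k]:
            d_train = rms_distance(f_j(x_train), f_k(x_train))
            d_unlab = rms_distance(f_j(x_unlabeled), f_k(x_unlabeled))
            if d_train > 1e-12:  # skip degenerate pairs (illustrative guard)
                ratios.append(d_unlab / d_train)

        # The simplest model has no predecessor, so it gets no adjustment.
        adjustment = max(ratios) if ratios else 1.0
        scores.append(train_err * adjustment)

    return int(np.argmin(scores)), scores
```

Calling `select_degree(x_train, y_train, x_unlabeled)` returns the selected degree together with the adjusted score of every candidate. In the time-series setting described above, `x_unlabeled` would hold the inputs of the h most recent examples whose targets are not yet known; when no unlabeled inputs exist, samples drawn from an estimated input density could play the same role.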
