An R-square coefficient based on final prediction error

Abstract. In this paper, we propose an $R^2$ coefficient that measures the percentage of variance explained by the fitted model (not by the true model) in a multiple linear regression problem. This new coefficient, denoted $R^2_{\mathrm{FPE}}$, is a reparametrization of the FPE criterion used for model selection, such that maximizing $R^2_{\mathrm{FPE}}$ over a set of candidate models is the same as minimizing FPE. Thus, $R^2_{\mathrm{FPE}}$ can be used simultaneously for assessing goodness of fit and for model selection. At each step of a model selection procedure, the user can then quantify what is gained or lost when adding a variable to the model or removing one from it, which should make model selection easier to practice and understand.
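The equivalence between maximizing $R^2_{\mathrm{FPE}}$ and minimizing FPE can be illustrated with a minimal sketch. The abstract does not state the formula, so the code below rests on two assumptions: that FPE takes Akaike's usual form, (RSS/n)(n+p)/(n-p) for a model with p coefficients (intercept included) fitted to n observations, and that the coefficient is one minus the ratio of the model's FPE to the FPE of the intercept-only model. The function names and the RSS values are hypothetical and chosen only for illustration.

```python
def fpe(rss, n, p):
    """Akaike's final prediction error (assumed form) for a linear model
    with p coefficients, intercept included, fitted to n observations."""
    return (rss / n) * (n + p) / (n - p)


def r2_fpe(rss, tss, n, p):
    """Assumed reparametrization: 1 - FPE(model) / FPE(intercept-only model).

    For the intercept-only model, RSS equals the total sum of squares (TSS)
    and p = 1, so the coefficient is 0 for that baseline.
    """
    return 1.0 - fpe(rss, n, p) / fpe(tss, n, 1)


# Hypothetical nested candidates: number of coefficients p -> residual sum of squares.
n = 50
tss = 200.0
candidates = {2: 120.0, 3: 90.0, 4: 88.5, 5: 88.0}

# n and TSS are fixed across candidates, so r2_fpe is a decreasing function
# of fpe and both criteria select the same model.
best_by_fpe = min(candidates, key=lambda p: fpe(candidates[p], n, p))
best_by_r2_fpe = max(candidates, key=lambda p: r2_fpe(candidates[p], tss, n, p))

# The ordinary R^2 = 1 - RSS/TSS never decreases as variables are added,
# so it always favors the largest nested model.
best_by_plain_r2 = max(candidates, key=lambda p: 1.0 - candidates[p] / tss)

for p, rss in candidates.items():
    print(p, round(fpe(rss, n, p), 4), round(r2_fpe(rss, tss, n, p), 4))
```

Under these assumptions, the difference in the coefficient between two candidates gives a direct reading of what is gained or lost by adding or removing a variable, while the selected model is the one FPE would choose.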
