Generalized cross-validation as a method for choosing a good ridge parameter

Consider the ridge estimate $\hat{\beta}(\lambda)$ for $\beta$ in the model $y = X\beta + \epsilon$, $\epsilon \sim N(0, \sigma^2 I)$, $\sigma^2$ unknown, $\hat{\beta}(\lambda) = (X^T X + n\lambda I)^{-1} X^T y$. We study the method of generalized cross-validation (GCV) for choosing a good value for $\lambda$ from the data. The estimate is the minimizer of $V(\lambda)$ given by $V(\lambda) = \frac{1}{n}\|(I - A(\lambda))y\|^2 \big/ \left[\frac{1}{n}\,\mathrm{Trace}(I - A(\lambda))\right]^2$, where $A(\lambda) = X(X^T X + n\lambda I)^{-1} X^T$. This estimate is a rotation-invariant version of Allen's PRESS, or ordinary cross-validation. It behaves like a risk improvement estimator but does not require an estimate of $\sigma^2$, so it can be used when $n - p$ is small, or even if $p \ge 2n$ in certain cases. The GCV method can also be used in subset selection and singular value truncation methods for regression, and even to choose from among mixtures of these methods.
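
As an illustration of the criterion described above, the following is a minimal sketch, not the authors' implementation, of evaluating $V(\lambda)$ and picking its minimizer over a candidate grid. It assumes NumPy and uses the thin singular value decomposition of $X$, under which $A(\lambda)$ has eigenvalues $s_i^2 / (s_i^2 + n\lambda)$ on the column space of $X$. The function names `gcv_score`, `choose_lambda`, and `ridge_estimate`, and the use of a finite grid rather than a continuous minimization, are assumptions made for this example.

```python
import numpy as np


def gcv_score(X, y, lam):
    """Evaluate V(lam) = (1/n)||(I - A)y||^2 / [(1/n) Trace(I - A)]^2."""
    n = X.shape[0]
    # Thin SVD: X = U diag(s) Vt, so A(lam) = U diag(s^2 / (s^2 + n*lam)) U^T.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    shrink = s**2 / (s**2 + n * lam)
    fitted = U @ (shrink * (U.T @ y))      # A(lam) y
    resid = y - fitted                     # (I - A(lam)) y
    numerator = resid @ resid / n
    # Trace(I - A) = n - sum of the eigenvalues of A.
    denominator = (1.0 - shrink.sum() / n) ** 2
    return numerator / denominator


def choose_lambda(X, y, grid):
    """Return the grid value minimizing V, together with all scores."""
    scores = np.array([gcv_score(X, y, lam) for lam in grid])
    return grid[int(np.argmin(scores))], scores


def ridge_estimate(X, y, lam):
    """Ridge estimate (X^T X + n*lam*I)^{-1} X^T y."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)
```

A typical use would be to call `choose_lambda(X, y, np.logspace(-6, 1, 50))` and pass the selected value to `ridge_estimate`. Note that no estimate of $\sigma^2$ enters any of these computations; the grid bounds here are arbitrary and would need to be adapted to the scaling of the data.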
