Length modified ridge regression

Abstract: Biased regression methods can improve considerably on ordinary least squares regression when the data are few or noisy, or when the predictor variables are highly collinear. I present a new biased method that modifies the ordinary least squares estimate by adjusting each element of the estimated coefficient vector. The adjusting factors are found by minimizing a measure of prediction error. The optimal adjusting factors, however, depend on the unknown coefficient vector and on the variance of the noise, so in practice both are replaced by preliminary estimates. The final estimate of the coefficient vector has the same direction as the preliminary estimate, but its length is modified. Ridge regression is used as the principal method for finding the preliminary estimate, and the resulting method is called length modified ridge regression. Length modified principal components regression is also considered. The prediction performance of the methods is compared with that of other regression methods (ridge, James-Stein, partial least squares, principal components and variable subset selection) in a simulation study. Of all the methods considered, length modified ridge regression shows the best overall behaviour. The improvement over ridge regression is moderate but significant, especially when the data are few and noisy.
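The abstract does not state the adjusting factors themselves, so the following is only a minimal sketch of the structural property it describes: a preliminary ridge estimate whose direction is kept while its length is changed. The scalar `c` below is chosen by a least squares fit of the response on the ridge fitted values. That choice is an assumption made purely for illustration and is not the prediction-error criterion of the paper. The function names `ridge` and `rescale_length`, the ridge parameter `k=5.0` and the simulated data are likewise hypothetical.

```python
import numpy as np


def ridge(X, y, k):
    """Ridge estimate (X'X + kI)^{-1} X'y for a ridge parameter k >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)


def rescale_length(X, y, b_prelim):
    """Return c * b_prelim: same direction as b_prelim, modified length.

    Illustrative stand-in only: c minimizes the training residual sum of
    squares ||y - c * X b_prelim||^2, NOT the paper's prediction-error measure.
    """
    fitted = X @ b_prelim
    denom = fitted @ fitted
    if denom == 0.0:
        # Degenerate case (zero fitted values): nothing to rescale.
        return b_prelim.copy()
    c = (fitted @ y) / denom
    return c * b_prelim


# Small simulated data set with few, noisy observations (hypothetical values).
rng = np.random.default_rng(0)
n, p = 20, 5
X = rng.standard_normal((n, p))
beta = np.array([1.0, 0.5, 0.0, -0.5, 2.0])
y = X @ beta + rng.standard_normal(n)

b_ridge = ridge(X, y, k=5.0)               # preliminary estimate
b_mod = rescale_length(X, y, b_ridge)      # same direction, new length
```

Because this stand-in minimizes training error rather than an estimate of prediction error, it can only lengthen a ridge estimate. A factor derived from a prediction-error criterion, as the abstract describes, may instead shrink the preliminary estimate.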
