Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution

Empirical techniques for estimating the shrinkage of the sample R² have been advocated as alternatives to analytical formulae. Although such techniques may be appropriate for estimating the coefficient of cross-validation, they do not provide accurate estimates of the population multiple correlation. The accuracy of four empirical techniques (simple cross-validation, multi-cross-validation, jackknife, and bootstrap) was investigated in a Monte Carlo study. Random samples of size 20 to 200 were drawn from a pseudopopulation of actual field data. Regression models were investigated with population coefficients of determination ranging from .04 to .50 and with numbers of regressors ranging from 2 to 10. Substantial statistical bias was evident when the shrunken R² values were used to estimate the population squared multiple correlation. Researchers are advised to avoid the empirical techniques when the parameter of interest is the population coefficient of determination rather than the coefficient of cross-validation.
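The distinction between the two parameters can be illustrated with a small simulation. The following is a minimal sketch, not the design used in the study: it assumes synthetic, independent standard-normal predictors with equal weights rather than a pseudopopulation of field data, it covers only simple (split-half) cross-validation, and it uses the familiar adjusted R² formula as a stand-in for the analytical alternatives. All function names and default settings are hypothetical.

```python
import numpy as np


def sample_r2(X, y):
    """Squared multiple correlation of an OLS fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def adjusted_r2(r2, n, k):
    """Familiar analytical shrinkage formula: 1 - (1 - R^2)(n - 1)/(n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def simple_cross_validation_r2(X, y, rng):
    """Fit on a random half, then square the correlation between
    predicted and observed y in the held-out half."""
    n = len(y)
    idx = rng.permutation(n)
    train, test = idx[: n // 2], idx[n // 2:]
    A_train = np.column_stack([np.ones(len(train)), X[train]])
    beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
    pred = np.column_stack([np.ones(len(test)), X[test]]) @ beta
    return np.corrcoef(pred, y[test])[0, 1] ** 2


def simulate(n=40, k=5, rho2=0.25, reps=2000, seed=0):
    """Return the mean sample R^2, mean adjusted R^2, and mean
    cross-validated R^2 over `reps` samples of size n.

    Assumption: predictors are independent N(0, 1) with equal weights,
    scaled so that the population R^2 equals rho2 and var(y) = 1.
    """
    rng = np.random.default_rng(seed)
    b = np.full(k, np.sqrt(rho2 / k))
    out = np.empty((reps, 3))
    for i in range(reps):
        X = rng.standard_normal((n, k))
        y = X @ b + np.sqrt(1.0 - rho2) * rng.standard_normal(n)
        r2 = sample_r2(X, y)
        out[i] = (r2, adjusted_r2(r2, n, k),
                  simple_cross_validation_r2(X, y, rng))
    return out.mean(axis=0)


if __name__ == "__main__":
    r2, adj, cv = simulate()
    print(f"population R^2 = 0.25, sample = {r2:.3f}, "
          f"adjusted = {adj:.3f}, cross-validated = {cv:.3f}")
```

Under these assumptions the mean sample R² should overshoot the population value and the mean cross-validated value should undershoot it, which is consistent with the caution above: the cross-validated figure targets the coefficient of cross-validation, not the population coefficient of determination.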
