Accuracy of Population Validity and Cross-Validity Estimation: An Empirical Comparison of Formula-Based, Traditional Empirical, and Equal Weights Procedures

An empirical Monte Carlo study was performed using predictor and criterion data from 84,808 U.S. Air Force enlistees. For each of seven sample size conditions (25, 40, 60, 80, 100, 150, and 200), 501 samples were drawn. Using an eight-predictor model, 500 estimates for each of 9 validity and 11 cross-validity estimation procedures were generated for each sample size condition. These estimates were then compared to the actual squared population validity and cross-validity in terms of mean bias and mean squared bias. For the regression models determined using ordinary least squares, the Ezekiel procedure produced the most accurate estimates of squared population validity (followed by the Smith and the Wherry procedures), and Burket’s formula resulted in the best estimates of squared population cross-validity. Other analyses compared the coefficients determined by traditional empirical cross-validation with equal weights; equal weights resulted in no loss of predictive accuracy and less shrinkage. Numerous issues for future basic research on validation and cross-validation are identified.
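
The comparison described above can be illustrated with a minimal sketch. It is not the study's code or data: the population here is synthetic, the predictor weights are arbitrary, and the function names (`ezekiel_adjusted_r2`, `sample_r2`, `simulate`) are hypothetical. It assumes the formula commonly attributed to Ezekiel, 1 - (N - 1)/(N - p - 1) * (1 - R^2), and reports mean bias and mean squared bias against the squared population validity.

```python
import numpy as np


def ezekiel_adjusted_r2(r2, n, p):
    """Adjusted R^2 in the form commonly attributed to Ezekiel (assumed here)."""
    return 1.0 - (n - 1) / (n - p - 1) * (1.0 - r2)


def sample_r2(X, y):
    """Squared multiple correlation from an OLS fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def simulate(n, p=8, n_pop=20000, n_samples=500, seed=0):
    """Draw repeated samples of size n from a synthetic finite population and
    compare the adjusted estimate with the population R^2."""
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for the population; weights are arbitrary assumptions.
    X_pop = rng.standard_normal((n_pop, p))
    weights = np.linspace(0.05, 0.4, p)
    y_pop = X_pop @ weights + rng.standard_normal(n_pop)
    pop_r2 = sample_r2(X_pop, y_pop)  # squared population validity

    errors = np.empty(n_samples)
    for i in range(n_samples):
        idx = rng.choice(n_pop, size=n, replace=False)
        r2 = sample_r2(X_pop[idx], y_pop[idx])
        errors[i] = ezekiel_adjusted_r2(r2, n, p) - pop_r2

    mean_bias = errors.mean()
    mean_squared_bias = (errors ** 2).mean()
    return pop_r2, mean_bias, mean_squared_bias


if __name__ == "__main__":
    for n in (25, 40, 60, 80, 100, 150, 200):
        pop_r2, mb, msb = simulate(n)
        print(f"N={n:3d}  pop R^2={pop_r2:.3f}  mean bias={mb:+.4f}  MSB={msb:.4f}")
```

Other formula-based estimators could be swapped in for `ezekiel_adjusted_r2` to run the same comparison; cross-validity estimation would additionally require applying each sample's weights to the population or to a second sample.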
