Guided Regression Modeling for Prediction and Exploration of Structure With Many Explanatory Variables

A modeling procedure for multiple linear regression is proposed. This procedure begins with preliminary interior and global analyses. The global analysis is based on a form of canonical analysis of the sample correlation matrix of all variables. Depending on the regression objective, the procedure uses information from that analysis to guide the selection of methods for achieving the objective. The two objectives discussed are prediction and exploration of structure. The dependence of the choice of methods on the regression objective is illustrated on a “benchmark” data set, and the results obtained by our approach are compared with published results obtained by other methods. The suggested procedure is particularly useful for data sets with large numbers of explanatory variables, for which more conventional methods become more expensive, less flexible, or less informative about relationships among variables.