Criticality of predictors in multiple regression.

A new method is proposed for comparing all predictors in a multiple regression model. The method generates a measure of predictor criticality, which is distinct from, and has several advantages over, traditional indices of predictor importance. Using the bootstrap (resampling with replacement), a large number of samples is drawn from a given data set containing one response variable and p predictors. For each sample, all 2^p − 1 subset regression models are fitted and the best subset model is selected. This yields the (multinomial) distribution of the probability that each of the 2^p − 1 subsets is 'the best' model for the data set. A predictor's criticality is defined as a function of the probabilities associated with the models that include that predictor: a predictor included in many highly probable models is critical to identifying the best-fitting regression model and, therefore, to predicting the response variable. The procedure can be applied to fixed and random regression models and can use any measure of goodness of fit (e.g., adjusted R², Mallows' C_p, AIC) to identify the best model. Several criticality measures can be defined from different combinations of the probabilities of the best-fitting models, and asymptotic confidence intervals for each variable's criticality can be derived. The procedure is illustrated with several examples.
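The bootstrap procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the choice of adjusted R² as the goodness-of-fit criterion, the function names, and the simple summed-probability criticality measure are assumptions for this example.

```python
# Illustrative sketch of bootstrap predictor criticality (assumed details:
# adjusted R^2 as the fit criterion; criticality of predictor j taken as the
# summed probability of all "best" subset models that include j).
import itertools
import numpy as np

def adjusted_r2(y, X_sub):
    """OLS fit of y on X_sub (with intercept); return adjusted R^2."""
    n, k = X_sub.shape
    Xd = np.column_stack([np.ones(n), X_sub])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))

def criticality(y, X, n_boot=500, rng=None):
    """Bootstrap estimate of each predictor's criticality: the probability
    that the best-fitting subset model includes that predictor."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    # All 2^p - 1 non-empty predictor subsets.
    subsets = [s for r in range(1, p + 1)
               for s in itertools.combinations(range(p), r)]
    best_counts = np.zeros(len(subsets))
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample rows with replacement
        yb, Xb = y[idx], X[idx]
        scores = [adjusted_r2(yb, Xb[:, s]) for s in subsets]
        best_counts[int(np.argmax(scores))] += 1
    probs = best_counts / n_boot             # multinomial "best model" probs
    return np.array([probs[[j in s for s in subsets]].sum()
                     for j in range(p)])
```

Any other fit criterion mentioned in the abstract (C_p, AIC) could be substituted for `adjusted_r2` without changing the rest of the procedure; note that fitting all 2^p − 1 subsets per bootstrap sample limits this brute-force version to small p.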
