Quantifying the Relative Importance of Predictors in Multiple Linear Regression Analyses for Public Health Studies

Multiple linear regression is widely used in many scientific fields, including public health, to evaluate how an outcome or response variable is related to a set of predictors. Researchers therefore often need to assess the "relative importance" of a predictor by comparing its contribution with the contributions of the other predictors in a particular regression model, and the development of valid statistical methods for estimating relative importance is of great interest. In this research, the authors defined the relative importance of a predictor as the portion of the squared multiple correlation attributable to that predictor in the final model of interest. A number of relative importance indices motivated by this definition are reviewed, including the squared zero-order correlation, the squared semipartial correlation, the Product Measure (i.e., Pratt's Index), the General Dominance Index, and Johnson's Relative Weight. The authors compared these indices using data sets from an occupational health study in which human inhalation exposure to styrene was measured and from a laboratory animal study on risk factors for atherosclerosis, and examined their statistical properties using bootstrap methods. The analysis suggests that the General Dominance Index and Johnson's Relative Weight are the preferred methods for quantifying the relative importance of predictors in a multiple linear regression model. Johnson's Relative Weight involves substantially less computational burden than the General Dominance Index when the number of predictors in the final model is large.
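The five indices named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`r_squared`, `importance_indices`, `bootstrap_ci`), the use of NumPy, and the percentile form of the bootstrap interval are all assumptions made for illustration, and the formulas follow the standard textbook definitions of each index. The sketch also shows where the computational difference comes from: the General Dominance Index fits a model for every subset of predictors, whereas Johnson's Relative Weight needs only one eigendecomposition of the predictor correlation matrix.

```python
import itertools
import numpy as np


def r_squared(X, y, cols):
    """R^2 of an OLS fit (with intercept) of y on the columns listed in cols."""
    if not cols:
        return 0.0  # intercept-only model explains no variance
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def importance_indices(X, y):
    """Return a dict of relative importance indices, one value per predictor."""
    n, p = X.shape
    Z = (X - X.mean(0)) / X.std(0)       # standardized predictors
    zy = (y - y.mean()) / y.std()        # standardized response
    Rxx = Z.T @ Z / n                    # predictor correlation matrix
    rxy = Z.T @ zy / n                   # zero-order correlations with y
    beta = np.linalg.solve(Rxx, rxy)     # standardized regression coefficients
    all_cols = set(range(p))
    full = r_squared(X, y, range(p))

    # Squared semipartial: drop in R^2 when predictor j is removed last.
    semipartial = np.array(
        [full - r_squared(X, y, all_cols - {j}) for j in range(p)]
    )

    # General dominance: mean R^2 increment within each subset size,
    # then averaged over the p subset sizes (requires all subset models).
    dominance = np.zeros(p)
    for j in range(p):
        others = sorted(all_cols - {j})
        for k in range(p):
            incs = [
                r_squared(X, y, S + (j,)) - r_squared(X, y, S)
                for S in itertools.combinations(others, k)
            ]
            dominance[j] += np.mean(incs) / p

    # Johnson's relative weights: regress y on the orthogonal variables
    # closest to Z, then map the squared coefficients back to the predictors.
    evals, V = np.linalg.eigh(Rxx)
    Lam = V @ np.diag(np.sqrt(evals)) @ V.T   # symmetric square root of Rxx
    b = np.linalg.solve(Lam, rxy)             # coefficients on orthogonal variables
    johnson = (Lam ** 2) @ (b ** 2)

    return {
        "zero_order_sq": rxy ** 2,
        "semipartial_sq": semipartial,
        "pratt": beta * rxy,                  # Product Measure
        "general_dominance": dominance,
        "johnson": johnson,
    }


def bootstrap_ci(X, y, index="johnson", n_boot=200, seed=0):
    """Percentile 95% interval for one index, resampling rows with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        i = rng.integers(0, n, size=n)
        draws.append(importance_indices(X[i], y[i])[index])
    return np.percentile(draws, [2.5, 97.5], axis=0)
```

Here `X` is an n-by-p array of predictors and `y` a length-n response. Under these definitions the Product Measure, General Dominance Index, and Johnson's Relative Weight each sum to the model R^2 across predictors, while the squared zero-order and squared semipartial correlations generally do not when predictors are correlated.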
