Bayesian Model Averaging in Proportional Hazard Models: Assessing the Risk of a Stroke
Chris

In the context of the Cardiovascular Health Study, a comprehensive investigation into the risk factors for stroke, we apply Bayesian model averaging to the selection of variables in Cox proportional hazard models. We use an extension of the leaps and bounds algorithm for locating the models that are to be averaged over and make available S-PLUS software to implement the methods. Bayesian model averaging provides a posterior probability that each variable belongs in the model, a more directly interpretable measure of variable importance than a P-value. P-values from models preferred by stepwise methods tend to overstate the evidence for the predictive value of a variable and do not account for model uncertainty. We introduce the partial predictive score to evaluate predictive performance. For the Cardiovascular Health Study, Bayesian model averaging predictively outperforms standard model selection and does a better job of assessing who is at high risk for stroke.
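
To make concrete the idea of a posterior probability that a variable belongs in the model, here is a minimal Python sketch. It is not the authors' S-PLUS software. It assumes equal prior probabilities for all candidate models and uses the common BIC approximation to posterior model probabilities, which the abstract does not confirm as the method used. The variable names, the candidate models, and the BIC values are hypothetical and chosen only for illustration.

```python
import math

def posterior_model_probs(bics):
    """Approximate P(M_k | D) from BIC values.

    Assumption: all models have equal prior probability, so
    P(M_k | D) is proportional to exp(-BIC_k / 2).
    """
    best = min(bics)  # subtract the minimum for numerical stability
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]

def inclusion_probs(models, probs):
    """Posterior probability that each variable belongs in the model:
    the summed probability of every model containing that variable."""
    variables = sorted({v for m in models for v in m})
    return {
        v: sum(p for m, p in zip(models, probs) if v in m)
        for v in variables
    }

# Hypothetical candidate models (sets of covariates) and hypothetical BICs.
models = [
    ("age",),
    ("age", "sbp"),
    ("age", "sbp", "diabetes"),
    ("sbp",),
]
bics = [1012.4, 1005.1, 1006.3, 1020.9]

probs = posterior_model_probs(bics)
incl = inclusion_probs(models, probs)

for model, p in zip(models, probs):
    print(f"{' + '.join(model):25s} P(M|D) = {p:.3f}")
for var, p in incl.items():
    print(f"P({var} in model | D) = {p:.3f}")
```

In this sketch a variable's inclusion probability accumulates support from every model that contains it, rather than being read off a single selected model. That is the sense in which model averaging accounts for model uncertainty.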
