Multivariable Prognostic Models: Issues in Developing Models, Evaluating Assumptions and Adequacy, and Measuring and Reducing Errors

Multivariable regression models are powerful tools that are used frequently in studies of clinical outcomes. These models can use a mixture of categorical and continuous variables and can handle partially observed (censored) responses. However, uncritical application of modelling techniques can result in models that poorly fit the dataset at hand, or, even more likely, inaccurately predict outcomes on new subjects. One must know how to measure qualities of a model's fit in order to avoid poorly fitted or overfitted models. Measurement of predictive accuracy can be difficult for survival time data in the presence of censoring. We discuss an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities. Both types of predictive accuracy should be validated without bias, using bootstrapping or cross-validation, before predictions are used in a new data series. We discuss some of the hazards of poorly fitted and overfitted regression models and present one modelling strategy that avoids many of the problems discussed. The methods described are applicable to all regression models, but are particularly needed for binary, ordinal, and time-to-event outcomes. Methods are illustrated with a survival analysis in prostate cancer using Cox regression.
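To make the idea of predictive discrimination under censoring concrete, the following is a minimal sketch of a concordance (c) index for right-censored survival data. The abstract does not name the index it discusses, so treating it as the concordance index is an assumption here. The function name `concordance_index` and its arguments are illustrative, and pairs with tied survival times are simply skipped, which is a simplification.

```python
def concordance_index(times, events, risk_scores):
    """Concordance index for right-censored data (illustrative sketch).

    times       : follow-up time for each subject
    events      : 1 if the event was observed, 0 if the subject was censored
    risk_scores : model output where a HIGHER score means HIGHER predicted risk
                  (i.e. shorter expected survival)

    A pair (i, j) is usable when subject i had an observed event strictly
    before subject j's follow-up time, so we know i failed first. The pair is
    concordant when the model also assigned i the higher risk. Ties in the
    risk score count as half. Pairs with tied times are skipped here.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the known earlier failure
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs: cannot compute concordance")
    return concordant / usable


# Hypothetical toy data: four subjects, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.1, 0.5, 0.7]
c_apparent = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random predictions and 1.0 to perfect ordering of risk. Computed on the same data used to fit the model, this is only an apparent estimate; the abstract's point is that it should be corrected by bootstrapping or cross-validation before being trusted for new subjects.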
