TUTORIAL IN BIOSTATISTICS

MULTIVARIABLE PROGNOSTIC MODELS: ISSUES IN DEVELOPING MODELS, EVALUATING ASSUMPTIONS AND ADEQUACY, AND MEASURING AND REDUCING ERRORS

SUMMARY

Multivariable regression models are powerful tools that are used frequently in studies of clinical outcomes. These models can use a mixture of categorical and continuous variables and can handle partially observed (censored) responses. However, uncritical application of modelling techniques can result in models that poorly fit the dataset at hand, or, even more likely, inaccurately predict outcomes on new subjects. One must know how to measure qualities of a model's fit in order to avoid poorly fitted or overfitted models. Measurement of predictive accuracy can be difficult for survival time data in the presence of censoring. We discuss an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities. Both types of predictive accuracy should be unbiasedly validated using bootstrapping or cross-validation, before using predictions in a new data series. We discuss some of the hazards of poorly fitted and overfitted regression models and present one modelling strategy that avoids many of the problems discussed. The methods described are applicable to all regression models, but are particularly needed for binary, ordinal, and time-to-event outcomes. Methods are illustrated with a survival analysis in prostate cancer using Cox regression.

Accurate estimation of patient prognosis is important for many reasons. First, prognostic estimates can be used to inform the patient about likely outcomes of her disease. Second, the physician can use estimates of prognosis as a guide for ordering additional tests and selecting appropriate therapies. Third, prognostic assessments are useful in the evaluation of technologies; prognostic estimates derived both with and without using the results of a given test can be compared to measure the incremental prognostic information provided by that test over what is provided by prior information. Fourth, a researcher may want to estimate the effect of a single factor (for example, treatment given) on prognosis in an observational study in which many uncontrolled confounding factors are also measured. Here the simultaneous effects of the uncontrolled variables must be controlled (held constant mathematically if using a regression model) so that the effect of the factor of interest can be more purely estimated. An analysis of how variables (especially continuous ones) affect the patient outcomes of interest is necessary to
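
The summary refers to an index of predictive discrimination that remains usable when survival times are censored. As a rough illustration only, the Python sketch below computes one common index of this kind, a rank-based concordance probability (often called the c index); whether this matches the exact definition used later in the tutorial is an assumption. The function name `concordance_index`, the argument names, and the toy data are hypothetical. The sketch skips pairs with tied survival times and pairs whose shorter time is censored, and it counts tied predictions as one half.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Proportion of usable pairs in which the subject with the higher
    predicted risk also has the shorter observed survival time.

    times       -- follow-up time for each subject
    events      -- 1 (or True) if the event was observed, 0 if censored
    risk_scores -- model prediction; higher means worse prognosis
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times: skipped in this simplified sketch
        if not events[a]:
            continue  # shorter time is censored, so the ordering is unknown
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (made up for illustration): 5 subjects, 2 of them censored.
times = [5, 8, 12, 3, 9]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.2, 0.8, 0.5]
c = concordance_index(times, events, risk)  # 6 of 7 usable pairs concordant
```

A value of 0.5 corresponds to random predictions and 1.0 to perfect discrimination. When computed on the same data used to fit the model, such an index is optimistic, which is why the summary calls for validation by bootstrapping or cross-validation.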
