A comparison of multivariable mathematical methods for predicting survival: I. Introduction, rationale, and general strategy.

This paper and the two that follow it (Parts I-III) report an investigation of performance variability for four multivariable methods: discriminant function analysis, and linear, logistic, and Cox regression. Each method was examined for its performance in using the same independent variables to develop predictive models for survival of a large cohort of patients with lung cancer. The cogent biologic attributes of the patients had previously been divided into five ordinal stages having a strong prognostic gradient. With stratified random sampling, we prepared seven "generating" sets of data in which the five biologic stages were arranged in proportional, uniform, symmetrical unimodal, decreasing exponential, increasing exponential, U-shaped, or bimodal distributions. Each of the multivariable methods was applied to each of the seven generating sets, and the results were tested in a separate "challenge" set, which had not been included in any of the generating sets. The research was intended not merely to compare the performance of the multivariable methods, but also to see how their performance would be affected by different statistical distributions of the same cogent biologic attributes. The results, which are presented in the second and third papers, were compared for selection of independent variables and coefficients, and for accuracy in fitting the generating sets and the challenge set.
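
The sampling design described above can be illustrated with a minimal sketch. It is not the authors' procedure: the shape weights in `SHAPES`, the set size, the challenge fraction, and the names `stage_quotas`, `stratified_sample`, and `build_sets` are all assumptions made for illustration, and the abstract does not give the actual stage proportions. The sketch holds out a challenge set first, then draws each generating set from the remaining patients by stratified random sampling so that the five stages follow a chosen distribution.

```python
import random

# Hypothetical relative weights for stages 1-5 (index 0-4); NOT the paper's values.
SHAPES = {
    "uniform": [1, 1, 1, 1, 1],
    "symmetrical_unimodal": [1, 2, 4, 2, 1],
    "decreasing_exponential": [16, 8, 4, 2, 1],
    "increasing_exponential": [1, 2, 4, 8, 16],
    "u_shaped": [4, 2, 1, 2, 4],
    "bimodal": [1, 4, 1, 4, 1],
}


def stage_quotas(weights, total):
    """Turn relative stage weights into integer counts that sum to `total`."""
    scale = sum(weights)
    quotas = [total * w // scale for w in weights]
    # Give the rounding remainder to the most heavily weighted stages.
    remainder = total - sum(quotas)
    heaviest_first = sorted(range(len(weights)), key=lambda i: -weights[i])
    for i in heaviest_first[:remainder]:
        quotas[i] += 1
    return quotas


def stratified_sample(patients_by_stage, quotas, rng):
    """Draw quotas[s] patients without replacement from each stage s."""
    sample = []
    for stage, quota in enumerate(quotas):
        pool = patients_by_stage[stage]
        if quota > len(pool):
            raise ValueError(f"stage {stage}: need {quota}, have {len(pool)}")
        sample.extend(rng.sample(pool, quota))
    return sample


def build_sets(cohort, set_size, challenge_fraction=0.2, seed=0):
    """Split `cohort` (list of (patient_id, stage) pairs, stage in 0..4)
    into a held-out challenge set and seven generating sets."""
    rng = random.Random(seed)
    shuffled = list(cohort)
    rng.shuffle(shuffled)
    n_challenge = int(len(shuffled) * challenge_fraction)
    challenge = shuffled[:n_challenge]
    pool = shuffled[n_challenge:]
    by_stage = [[p for p in pool if p[1] == s] for s in range(5)]
    shapes = dict(SHAPES)
    # "Proportional" keeps the stage mix of the available pool.
    shapes["proportional"] = [len(stage_pool) for stage_pool in by_stage]
    generating = {
        name: stratified_sample(by_stage, stage_quotas(weights, set_size), rng)
        for name, weights in shapes.items()
    }
    return generating, challenge
```

Because the challenge set is removed before any sampling, no challenge patient can appear in a generating set, although the same patient may appear in more than one generating set under this sketch.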
