Gene expression profiling: does it add predictive accuracy to clinical characteristics in cancer prognosis?

It is widely accepted that gene expression classifiers need to be externally validated by showing that they predict the outcome well enough in patients other than those from whose data the classifier was derived. Unfortunately, the gain in predictive accuracy that the classifier offers over established clinical prognostic factors is often not quantified. Our objective is to illustrate the application of appropriate statistical measures for this purpose. To compare the predictive accuracy of a model based on the clinical factors only with that of a model based on the clinical factors plus the gene classifier, we compute the decrease in predictive inaccuracy and the proportion of explained variation. We obtained these measures for three studies of published gene classifiers: survival of lymphoma patients, survival of breast cancer patients, and diagnosis of lymph node metastases in head and neck cancer. Across the three studies, our results indicate that the explained variation and predictive accuracy added by the gene classifiers vary and may be small. Therefore, the gain offered by future gene classifiers should routinely be demonstrated by appropriate statistical measures, such as the ones we recommend.
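The comparison described above can be illustrated with a minimal sketch for a binary outcome, such as the presence of lymph node metastases. It assumes that predictive inaccuracy is measured as the mean absolute distance between the observed 0/1 outcome and the predicted probability, and that explained variation is the relative reduction of that inaccuracy compared with a null model that predicts the overall prevalence for every patient. The function names, the outcome vector, and both sets of predicted probabilities are hypothetical and chosen only for illustration. The sketch is not the estimator used in the three studies and does not handle censored survival times.

```python
def predictive_inaccuracy(outcomes, predicted_probs):
    """Mean absolute distance between observed 0/1 outcomes and predictions."""
    distances = [abs(y - p) for y, p in zip(outcomes, predicted_probs)]
    return sum(distances) / len(distances)


def explained_variation(outcomes, predicted_probs):
    """Relative decrease in predictive inaccuracy versus a prevalence-only model."""
    prevalence = sum(outcomes) / len(outcomes)
    # Null model: every patient gets the overall event proportion as prediction.
    d_null = predictive_inaccuracy(outcomes, [prevalence] * len(outcomes))
    d_model = predictive_inaccuracy(outcomes, predicted_probs)
    return (d_null - d_model) / d_null


# Hypothetical validation data: 1 = event observed, 0 = no event.
outcomes = [1, 0, 1, 0, 0, 1, 0, 1]

# Hypothetical predicted probabilities from the two competing models.
p_clinical = [0.6, 0.4, 0.7, 0.5, 0.3, 0.6, 0.4, 0.5]
p_clinical_plus_gene = [0.7, 0.3, 0.8, 0.4, 0.2, 0.7, 0.4, 0.6]

v_clinical = explained_variation(outcomes, p_clinical)
v_full = explained_variation(outcomes, p_clinical_plus_gene)

# Added explained variation attributable to the gene classifier.
added = v_full - v_clinical

print(f"clinical factors only:       V = {v_clinical:.3f}")
print(f"clinical + gene classifier:  V = {v_full:.3f}")
print(f"added by gene classifier:    {added:.3f}")
```

In practice the predicted probabilities would come from models evaluated on patients not used to derive the classifier, and the uncertainty of the added explained variation would need to be assessed, for example by resampling.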
