Comparison of Predicted Probabilities of Proportional Hazards Regression and Linear Discriminant Analysis Methods Using a Colorectal Cancer Molecular Biomarker Database

Background: Although a majority of studies in cancer biomarker discovery claim to use proportional hazards regression (PHREG) to study the ability of a biomarker to predict survival, few studies use the predicted probabilities obtained from the model to assess the quality of the model. In this paper, we compared the quality of predictions by a PHREG model to that of a linear discriminant analysis (LDA) model in both training and test set settings. Methods: The PHREG and LDA models were built on a dataset of 491 colorectal cancer (CRC) patients comprising demographic and clinicopathologic variables and phenotypic expression of p53 and Bcl-2. Two variable selection methods, stepwise discriminant analysis and backward selection, were used to identify the final models. The endpoint of prediction in these models was five-year post-surgery survival. We also used a linear regression model to examine the effect of bin size in the training set on the accuracy of prediction in the test set. Results: The two variable selection techniques resulted in different models when stage was included in the list of variables available for selection. However, the proportions of survivors and non-survivors correctly identified were identical in both of these models. When stage was excluded from the variable list, the error rate for the LDA model was 42%, compared with an error rate of 34% for the PHREG model. Conclusions: This study suggests that a PHREG model can perform as well as or better than a traditional classifier such as LDA in classifying patients into prognostic classes. This study also suggests that, in the absence of tumor stage as a variable, Bcl-2 expression is a strong prognostic molecular marker of CRC.
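To illustrate how predicted probabilities from a proportional hazards model can be turned into a classification and scored, the following is a minimal sketch and not the authors' implementation. It uses the standard relation S(t | x) = S0(t)^exp(beta . x) for the predicted survival probability at a fixed time point. The baseline survival, the coefficients, the covariate coding, the 0.5 cutoff, and the four example patients are all hypothetical values chosen for illustration; none of them come from the study.

```python
import math


def cox_survival_probability(baseline_survival, coefficients, covariates):
    """Predicted survival at a fixed time: S(t | x) = S0(t) ** exp(beta . x)."""
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_survival ** math.exp(linear_predictor)


def classification_summary(probabilities, survived, threshold=0.5):
    """Error rate and per-class accuracy for a probability cutoff.

    Assumes both classes (survivors and non-survivors) are present.
    """
    predicted = [p >= threshold for p in probabilities]
    n_survivors = sum(survived)
    n_non_survivors = len(survived) - n_survivors
    correct_survivors = sum(1 for p, s in zip(predicted, survived) if s and p)
    correct_non_survivors = sum(
        1 for p, s in zip(predicted, survived) if not s and not p
    )
    errors = len(survived) - correct_survivors - correct_non_survivors
    return {
        "error_rate": errors / len(survived),
        "survivors_correct": correct_survivors / n_survivors,
        "non_survivors_correct": correct_non_survivors / n_non_survivors,
    }


# Hypothetical inputs (assumptions, not values from the study):
# five-year baseline survival and two coefficients for two binary covariates.
BASELINE_5YR = 0.6
COEFFICIENTS = [0.8, -0.6]

patients = [(0, 1), (1, 0), (0, 0), (1, 1)]   # hypothetical covariate codes
survived = [True, False, False, True]         # hypothetical observed outcomes

probabilities = [
    cox_survival_probability(BASELINE_5YR, COEFFICIENTS, x) for x in patients
]
summary = classification_summary(probabilities, survived)
print(summary)
```

The same `classification_summary` helper could be applied to posterior probabilities from an LDA model, so that both classifiers are scored on the same error rate and the same proportions of survivors and non-survivors correctly identified.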
