A three-gene model to robustly identify breast cancer molecular subtypes.

BACKGROUND Single sample predictors (SSPs) and subtype classification models (SCMs) are gene expression-based classifiers used to identify the four primary molecular subtypes of breast cancer (basal-like, HER2-enriched, luminal A, and luminal B). SSPs use hierarchical clustering, followed by nearest centroid classification, based on large sets of tumor-intrinsic genes. SCMs use a mixture of Gaussian distributions based on sets of genes whose expression is specifically correlated with three key breast cancer genes (estrogen receptor [ER], HER2, and aurora kinase A [AURKA]). The aim of this study was to compare the robustness, classification concordance, and prognostic value of these classifiers with those of a simplified three-gene SCM in a large compendium of microarray datasets.
METHODS Thirty-six publicly available breast cancer datasets (n = 5715) were subjected to molecular subtyping using five published classifiers (three SSPs and two SCMs) and SCMGENE, the new three-gene (ER, HER2, and AURKA) SCM. We used the prediction strength statistic to estimate the robustness of the classification models, defined as the capacity of a classifier to assign the same tumors to the same subtypes independently of the dataset used to fit it. We used Cohen κ coefficients to assess concordance between the subtype classifiers and Cramér V coefficients to assess association with clinical variables. We used Kaplan-Meier survival curves and cross-validated partial likelihood to compare the prognostic value of the resulting classifications. All statistical tests were two-sided.
RESULTS SCMs were statistically significantly more robust than SSPs, with SCMGENE being the most robust because of its simplicity. SCMGENE was statistically significantly concordant with published SCMs (κ = 0.65-0.70) and SSPs (κ = 0.34-0.59), was statistically significantly associated with ER status (V = 0.64), HER2 status (V = 0.52), and histological grade (V = 0.55), and yielded similarly strong prognostic value.
CONCLUSION Our results suggest that adequate classification of the major and clinically relevant molecular subtypes of breast cancer can be robustly achieved with quantitative measurements of three key genes.
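As an illustration of the two concordance statistics named in METHODS, the following is a minimal Python sketch of Cohen κ (chance-corrected agreement between two classifiers' subtype calls) and Cramér V (strength of association between subtype and a categorical clinical variable). It is not the study's implementation; the function names `cohen_kappa` and `cramers_v` are hypothetical, and any labels passed to them in practice would be the subtype calls and clinical annotations of the same tumors.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(calls_a, calls_b):
    """Cohen kappa: agreement between two label lists, corrected for chance."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of tumors given the same label by both.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Expected agreement under independence, from the marginal label counts.
    count_a, count_b = Counter(calls_a), Counter(calls_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both lists use one identical label throughout
    return (observed - expected) / (1.0 - expected)


def cramers_v(x, y):
    """Cramer V from the chi-square statistic of the x-by-y contingency table."""
    if len(x) != len(y) or not x:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))
    count_x, count_y = Counter(x), Counter(y)
    chi2 = 0.0
    for i in count_x:
        for j in count_y:
            expected = count_x[i] * count_y[j] / n
            chi2 += (joint[(i, j)] - expected) ** 2 / expected
    k = min(len(count_x), len(count_y)) - 1
    if k == 0:
        return 0.0  # a variable with a single level carries no association
    return sqrt(chi2 / (n * k))
```

Both functions take plain lists of labels in the same tumor order. κ near 1 indicates that two classifiers assign nearly identical subtypes, and V near 1 indicates that subtype almost determines the clinical variable; values near 0 indicate agreement or association no better than chance.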
