Predicting Triple-Negative Breast Cancer Subtype Using Multiple Single Nucleotide Polymorphisms for Breast Cancer Risk and Several Variable Selection Methods

Abstract

Introduction: Studies of triple-negative breast cancer have recently been extending their inclusion criteria and incorporating additional molecular markers into the selection criteria, opening up scope for targeted therapies. The screening phases required for studies of this type are often prolonged, since determining the molecular subtype and carrying out additional biomarker assessment is time-consuming. Parameters such as germline genotypes that can predict the molecular subtype before it becomes available from pathology might help with treatment planning and with optimizing the timing and cost of screening phases. This appears to be feasible, as rapid and low-cost genotyping methods are becoming increasingly available. The aim of this study was to identify breast cancer risk single nucleotide polymorphisms (SNPs) capable of predicting triple negativity, in addition to clinical predictors, in breast cancer patients.

Methods: This cross-sectional observational study included 1271 women with invasive breast cancer who were treated at a university hospital. A total of 76 validated breast cancer risk SNPs were successfully genotyped. Univariate associations between each SNP and triple negativity were explored using logistic regression analyses. Several variable selection and regression techniques were applied to identify a set of SNPs that together improve the prediction of triple negativity in addition to the clinical predictors of age at diagnosis and body mass index (BMI). The most accurate prediction method was determined by cross-validation.

Results: The SNP rs10069690 (TERT, CLPTM1L) was the only significant SNP (corrected p = 0.02) after correction of p values for multiple testing in the univariate analyses. This SNP and three additional SNPs from the genes RAD51B, CCND1, and FGFR2 were selected for prediction of triple negativity. The addition of these SNPs to the clinical predictors increased the cross-validated area under the curve (AUC) from 0.618 to 0.625. Age at diagnosis was the strongest predictor, stronger than any genetic characteristic.

Conclusion: Prediction of triple-negative breast cancer can be improved if SNPs associated with breast cancer risk are added to a prediction rule based on age at diagnosis and BMI. This finding could be used for prescreening purposes in complex molecular therapy studies for triple-negative breast cancer.

Zusammenfassung (English translation)

Introduction: Studies in triple-negative breast cancer have broadened their inclusion criteria by incorporating additional molecular markers. During screening for these therapy studies, both determining the molecular subtype and carrying out additional biomarker assessments take an extended period of time, which delays treatment. Germline genotypes could help predict the molecular subtype, particularly as fast and inexpensive genotyping methods are becoming increasingly available. The aim of this study was therefore to examine whether germline single nucleotide polymorphisms (SNPs) can help identify breast cancer patients with triple-negative breast carcinoma.

Methods: This cross-sectional study included 1271 patients with invasive breast carcinoma. A total of 76 validated breast cancer risk SNPs were successfully genotyped. Univariate associations between each SNP and triple negativity were tested using logistic regression. Various variable selection and regression methods were applied to identify a group of SNPs that, together with the clinical predictors age at diagnosis and BMI, improve the prediction of triple negativity. The method with the highest accuracy was determined by cross-validation.

Results: The SNP rs10069690 (TERT, CLPTM1L) was the only individual SNP significantly associated with triple negativity after p-value correction for multiple testing (p = 0.02). This SNP and three others in the genes RAD51B, CCND1, and FGFR2 were selected to jointly predict triple negativity in a prediction model. Adding these four SNPs increased the cross-validated AUC from 0.618 to 0.625. Age at diagnosis was by far the strongest predictor.

Conclusion: The prediction of triple-negative breast carcinoma can be improved if it is based not only on the clinical predictors age at diagnosis and BMI but also on breast cancer risk SNPs. The prediction model could be used when recruiting patients for complex molecular therapy studies.
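
The comparison reported in the Results (cross-validated AUC with clinical predictors alone versus clinical predictors plus SNPs) can be illustrated with a minimal sketch. It is not the study's analysis: the abstract does not name the variable selection methods or software used, so this example assumes plain unpenalized logistic regression, 5-fold cross-validation with pooled out-of-fold scores, and a single hypothetical SNP. The data are simulated, and every effect size, allele frequency, and sample size below is invented for illustration.

```python
import math
import random


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Unpenalized logistic regression by batch gradient descent.

    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def linear_score(w, xi):
    """Linear predictor; it ranks patients the same way as the fitted probability."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))


def cv_auc(X, y, k=5):
    """k-fold cross-validated AUC computed on pooled out-of-fold scores."""
    n = len(X)
    out_of_fold = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i in range(fold, n, k):  # held-out patients of this fold
            out_of_fold[i] = linear_score(w, X[i])
    return auc(out_of_fold, y)


# Simulated cohort (all numbers are assumptions, not study data).
random.seed(0)
rows, y = [], []
for _ in range(400):
    age = random.gauss(0.0, 1.0)  # standardized age at diagnosis
    bmi = random.gauss(0.0, 1.0)  # standardized BMI
    snp = (random.random() < 0.3) + (random.random() < 0.3)  # risk allele count 0/1/2
    logit = -1.8 - 0.6 * age + 0.3 * snp  # invented effects
    rows.append((age, bmi, snp))
    y.append(1 if random.random() < 1.0 / (1.0 + math.exp(-logit)) else 0)

auc_clinical = cv_auc([[a, b] for a, b, _ in rows], y)
auc_with_snp = cv_auc([[a, b, s] for a, b, s in rows], y)
print(f"clinical only: {auc_clinical:.3f}  clinical + SNP: {auc_with_snp:.3f}")
```

Because both models are scored only on patients held out from fitting, a gain in AUC reflects added predictive value rather than overfitting to the extra variable. With a small true effect the gain can be small or even negative in a single simulated cohort.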
