Identification of Copy Number Aberrations in Breast Cancer Subtypes Using Persistence Topology

DNA copy number aberrations (CNAs) are of biological and medical interest because they help identify regulatory mechanisms underlying tumor initiation and evolution. Identification of tumor-driving CNAs (driver CNAs), however, remains a challenging task because they are frequently hidden by CNAs that are the product of random events during tumor evolution. Experimental detection of CNAs is commonly accomplished through array comparative genomic hybridization (aCGH) assays, followed by supervised and/or unsupervised statistical methods that combine the segmented profiles of all patients to identify driver CNAs. Here, we extend a previously presented supervised algorithm for the identification of CNAs that is based on a topological representation of the data. Our method associates a two-dimensional (2D) point cloud with each aCGH profile and generates a sequence of simplicial complexes, mathematical objects that generalize the concept of a graph. This representation permits segmenting the data at different resolutions and identifying CNAs by interrogating the topological properties of these simplicial complexes. We tested our approach on a published dataset with the goal of identifying breast cancer CNAs associated with specific molecular subtypes. CNAs associated with each subtype were identified by analyzing that subtype separately and taking the remaining subtypes as the control. We found a new amplification in 11q at the location of the progesterone receptor in the Luminal A subtype. Aberrations in the Luminal B subtype were found only upon removal of the basal-like subtype from the control set. Under those conditions, all regions reported in the original publication, except for 17q, were confirmed. In the basal-like subtype, all aberrations except those in chromosome arms 8q and 12q were confirmed; these two chromosome arms were detected only upon removal of three patients with exceedingly large copy number values. More importantly, we detected 10 and 21 additional regions in the Luminal B and basal-like subtypes, respectively. Most of the additional regions were validated on an independent dataset, using GISTIC, or both. Furthermore, we found three new CNAs in the basal-like subtype: a combination of gains and losses in 1p, a gain in 2p, and a loss in 14q. Based on these results, we suggest that topological approaches that incorporate multiresolution analyses and interrogate topological properties of the data can help in the identification of copy number changes in cancer.
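
To make the topological representation described above more concrete, the following minimal Python sketch builds a 2D point cloud from a single profile and counts connected components at several scales. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the point cloud is constructed or which topological property is interrogated, so the pairing of consecutive values, the use of a Vietoris-Rips-style neighborhood graph, the choice of connected components (Betti-0), and the function names `point_cloud`, `betti0`, and `betti0_curve` are all hypothetical.

```python
from itertools import combinations
from math import dist


def point_cloud(profile):
    """Pair consecutive values of a profile into 2D points.

    Assumption: this sliding-window embedding is one plausible way to
    associate a 2D point cloud with an aCGH profile.
    """
    return [(profile[i], profile[i + 1]) for i in range(len(profile) - 1)]


def betti0(points, eps):
    """Count connected components of the graph joining points within eps.

    This graph is the 1-skeleton of a Vietoris-Rips complex at scale eps;
    components are tracked with a union-find structure.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = len(points)
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= eps:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_i] = root_j
                components -= 1
    return components


def betti0_curve(profile, scales):
    """Evaluate the component count at each scale (a multiresolution view)."""
    points = point_cloud(profile)
    return [betti0(points, eps) for eps in scales]


if __name__ == "__main__":
    # Toy profile with a single step change; the values are illustrative only.
    toy_profile = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
    print(betti0_curve(toy_profile, scales=[0.5, 1.0]))
```

In this toy setting, a profile with an abrupt shift produces separated clusters of points that merge only at larger scales, which is the kind of scale-dependent signal a supervised comparison between a subtype and its control set could then test.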

[43]  Sheng Tan,et al.  Loss of SNAIL regulated miR-128-2 on chromosome 3p22.3 targets multiple stem cell factors to promote transformation of mammary epithelial cells. , 2012, Cancer research.

[44]  D. Hanahan,et al.  Hallmarks of Cancer: The Next Generation , 2011, Cell.

[45]  D. Pinkel,et al.  Array comparative genomic hybridization and its applications in cancer , 2005, Nature Genetics.

[46]  D. Bell Our changing view of the genomic landscape of cancer , 2009, The Journal of pathology.

[47]  M. Reinders,et al.  Identification of cancer genes using a statistical framework for multi-experiment analysis of non-discretized array CGH data , 2008, Nucleic Acids Research.

[48]  Daniel Birnbaum,et al.  Integrated profiling of basal and luminal breast cancers. , 2007, Cancer research.

[49]  M Schwab,et al.  Two regions of deletion in 9p23-24 in sporadic breast cancer. , 1999, Cancer research.

[50]  M. Ringnér,et al.  Genomic subtypes of breast cancer identified by array-comparative genomic hybridization display distinct molecular and clinical characteristics , 2010, Breast Cancer Research.

[51]  Jose A. Perea,et al.  Sliding Windows and Persistence: An Application of Topological Methods to Signal Analysis , 2013, Found. Comput. Math..

[52]  Christos Sotiriou,et al.  Luminal B breast cancer: molecular characterization, clinical management, and future perspectives. , 2014, Journal of clinical oncology : official journal of the American Society of Clinical Oncology.

[53]  Steven J. M. Jones,et al.  Comprehensive molecular portraits of human breast tumors , 2012, Nature.

[54]  Tara L. Naylor,et al.  Distinct genomic profiles in hereditary breast tumors identified by array-based comparative genomic hybridization. , 2005, Cancer research.

[55]  B. Ljung,et al.  HER2 testing in breast cancer: NCCN Task Force report and recommendations. , 2006, Journal of the National Comprehensive Cancer Network : JNCCN.

[56]  Gurpreet W. Tang,et al.  COSMIC (the Catalogue of Somatic Mutations in Cancer): a resource to investigate acquired mutations in human cancer , 2009, Nucleic acids research.

[57]  H. Edelsbrunner,et al.  Persistent Homology — a Survey , 2022 .

[58]  Douglas Grove,et al.  Denoising array-based comparative genomic hybridization data using wavelets. , 2005, Biostatistics.

[59]  R. Brosh,et al.  Disease-causing missense mutations in human DNA helicase disorders. , 2013, Mutation research.

[60]  Charles M Perou,et al.  Systems biology and genomics of breast cancer. , 2011, Cold Spring Harbor perspectives in biology.

[61]  M. Wicha,et al.  Identification and functional analysis of 9p24 amplified genes in human breast cancer , 2012, Oncogene.

[62]  Donna Albertson,et al.  Genomic and Expression Analysis of the 8p11–12 Amplicon in Human Breast Cancer Cell Lines , 2004, Cancer Research.

[63]  T. Jones,et al.  A siRNA screen identifies RAD21, EIF3H, CHRAC1 and TANC2 as driver genes within the 8q23, 8q24.3 and 17q23 amplicons in breast cancer with effects on cell growth, survival and transformation. , 2014, Carcinogenesis.

[64]  Jorge S. Reis-Filho,et al.  Molecular Cytogenetic Identification of Subgroups of Grade III Invasive Ductal Breast Carcinomas with Different Clinical Outcomes , 2004, Clinical Cancer Research.

[65]  P. Deloukas,et al.  Signatures of mutation and selection in the cancer genome , 2010, Nature.

[66]  Marcel J T Reinders,et al.  Imaging , Diagnosis , Prognosis Clinical Cancer Research Integration of DNA Copy Number Alterations and Prognostic Gene Expression Signatures in Breast Cancer Patients , 2010 .

[67]  Jeroen de Ridder,et al.  Identification of cancer genes using a statistical framework for multiexperiment analysis of nondiscretized array CGH data , 2008, Nucleic acids research.

[68]  J. Davison,et al.  Genomic differences between estrogen receptor (ER)‐positive and ER‐negative human breast carcinoma identified by single nucleotide polymorphism array comparative genome hybridization analysis , 2011, Cancer.

[69]  A. Vincent-Salomon,et al.  Identification of typical medullary breast carcinoma as a genomic sub-group of basal-like carcinomas, a heterogeneous new molecular entity , 2007, Breast Cancer Research.

[70]  Ajay N. Jain,et al.  Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. , 2006, Cancer cell.

[71]  Y. Benjamini,et al.  Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .

[72]  Edwin Wang,et al.  Understanding genomic alterations in cancer genomes using an integrative network approach. , 2013, Cancer letters.

[73]  Y. Niu,et al.  Genetic changes at specific stages of breast cancer progression detected by comparative genomic hybridization , 2009, Journal of Molecular Medicine.

[74]  Wonshik Han,et al.  Genomic alterations identified by array comparative genomic hybridization as prognostic markers in tamoxifen-treated estrogen receptor-positive breast cancer , 2006, BMC Cancer.

[75]  C Caldas,et al.  Using array-comparative genomic hybridization to define molecular portraits of primary breast cancers , 2007, Oncogene.

[76]  J. Reis-Filho,et al.  Breast cancer precursors revisited: molecular features and progression pathways , 2010, Histopathology.

[77]  R. Tibshirani,et al.  Repeated observation of breast tumor subtypes in independent gene expression data sets , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[78]  Ajay N. Jain,et al.  Breast tumor copy number aberration phenotypes and genomic instability , 2006, BMC Cancer.

[79]  S A Forbes,et al.  The Catalogue of Somatic Mutations in Cancer (COSMIC) , 2008, Current protocols in human genetics.

[80]  Derek Y. Chiang,et al.  The landscape of somatic copy-number alteration across human cancers , 2010, Nature.

[81]  C. Brennan,et al.  Loss of the tyrosine phosphatase PTPRD leads to aberrant STAT3 activation and promotes gliomagenesis , 2014, Proceedings of the National Academy of Sciences.

[82]  J. Fridlyand,et al.  Deletion of chromosome 11q predicts response to anthracycline-based chemotherapy in early breast cancer. , 2007, Cancer research.

[83]  Jason Li,et al.  Studying cancer genomics through next-generation DNA sequencing and bioinformatics. , 2014, Methods in molecular biology.

[84]  P. Campbell,et al.  Next-generation sequencing in breast cancer: first take home messages , 2012, Current opinion in oncology.

[85]  M. Ringnér,et al.  High-resolution genomic and expression analyses of copy number alterations in HER2-amplified breast cancer , 2010, Breast Cancer Research.

[86]  P. Nederlof,et al.  Genomic signature of BRCA1 deficiency in sporadic basal‐like breast tumors , 2011, Genes, chromosomes & cancer.

[87]  M. Majchrzak,et al.  Ceramide galactosyltransferase (UGT8) is a molecular marker of breast cancer malignancy and lung metastases , 2010, British Journal of Cancer.

[88]  Raymond R Tubbs,et al.  HER-2 testing in breast cancer. , 2004 .

[89]  Nils A. Baas,et al.  Topological analysis of gene expression arrays identifies high risk molecular subtypes in breast cancer , 2012, Applicable Algebra in Engineering, Communication and Computing.