Differential Allele-Specific Expression Uncovers Breast Cancer Genes Dysregulated By Cis Noncoding Mutations

Identifying cancer-relevant mutations in noncoding regions is extremely challenging because such mutations are numerous, rarely recur, and are generally difficult to interpret. To uncover genes that are dysregulated due to somatic mutations in cis, we introduce the concept of differential allele-specific expression (ASE) and develop methods to identify genes within an individual’s cancer whose ASE differs from that found in matched normal tissue. When applied to breast cancer tumor samples, our methods readily detect the known allele-specific effects of copy number variation and nonsense-mediated decay. Further, genes that recurrently exhibit differential ASE across samples are cancer-relevant. Genes with cis mutations are enriched for differential ASE, and we find 147 potentially functional noncoding mutations cis to genes that exhibit significant differential ASE. Overall, our results suggest that differential ASE is a promising means of discovering gene dysregulation within an individual due to cis noncoding mutations.
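To make the idea of differential ASE concrete, the following minimal sketch compares allele read counts at one heterozygous site in a tumor sample with those in matched normal tissue. It is an illustration only and not the method developed in this work: it assumes a simple two-sided Fisher's exact test on a 2x2 table of read counts, and the function names (`fisher_exact_two_sided`, `differential_ase`) and all read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def differential_ase(tumor_ref, tumor_alt, normal_ref, normal_alt):
    """Return (shift in reference-allele fraction, p-value), tumor vs. normal.

    Illustrative assumption: allelic imbalance is summarized by the
    reference-allele read fraction at a single heterozygous site.
    """
    tumor_frac = tumor_ref / (tumor_ref + tumor_alt)
    normal_frac = normal_ref / (normal_ref + normal_alt)
    p_value = fisher_exact_two_sided(tumor_ref, tumor_alt, normal_ref, normal_alt)
    return tumor_frac - normal_frac, p_value


# Hypothetical counts: balanced expression in normal tissue, skewed in tumor
shift, p_value = differential_ase(tumor_ref=90, tumor_alt=10,
                                  normal_ref=50, normal_alt=50)
print(f"shift in reference-allele fraction: {shift:.2f}, p = {p_value:.2e}")
```

A gene whose allelic ratio is balanced in normal tissue but skewed in the tumor yields a large shift and a small p-value under this sketch. A realistic analysis would also need to account for factors this toy test ignores, such as copy number changes, tumor purity, read-mapping bias, and multiple-testing correction across genes.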
