Chromosomal imbalances detected via RNA-sequencing in 28 cancers

Abstract

Motivation: RNA-sequencing (RNA-seq) of tumor tissue is typically only used to measure gene expression. Here, we present a statistical approach that leverages existing RNA-seq data to also detect somatic copy number alterations (SCNAs), a pervasive phenomenon in human cancers, without a need to sequence the corresponding DNA.

Results: We present an analysis of 4942 participant samples from 28 cancers in The Cancer Genome Atlas (TCGA), demonstrating robust detection of SCNAs from RNA-seq. Using genotype imputation and haplotype information, our RNA-based method had a median sensitivity of 85% to detect SCNAs defined by DNA analysis, at high specificity (∼95%). As an example of translational potential, we successfully replicated SCNA features associated with breast cancer subtypes. Our results credential haplotype-based inference based on RNA-seq to detect SCNAs in clinical and population-based settings.

Availability and implementation: The analyses presented use the data publicly available from TCGA Research Network (http://cancergenome.nih.gov/). See Methods for details regarding data downloads. hapLOHseq software is freely available under The MIT license and can be downloaded from http://scheet.org/software.html.

Supplementary information: Supplementary data are available at Bioinformatics online.
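The sensitivity and specificity figures reported in the Results compare RNA-based SCNA calls against calls defined by DNA analysis. The sketch below illustrates how such metrics could be computed per sample and then summarized by a median across samples. It is only an illustration under stated assumptions: the abstract does not specify the unit of comparison, so chromosome arms are assumed here, and the function names and toy data are hypothetical rather than taken from hapLOHseq or the paper.

```python
from statistics import median


def sensitivity_specificity(rna_calls, dna_calls, regions):
    """Compare RNA-based calls with DNA-defined calls over a set of regions.

    Assumption: each region (here, a chromosome arm) is either called
    as altered or not; DNA-based calls are treated as the truth set.
    """
    tp = len(rna_calls & dna_calls)            # altered in both
    fn = len(dna_calls - rna_calls)            # missed by RNA
    fp = len(rna_calls - dna_calls)            # called only by RNA
    tn = len(regions - rna_calls - dna_calls)  # unaltered in both
    sens = tp / (tp + fn) if (tp + fn) else None
    spec = tn / (tn + fp) if (tn + fp) else None
    return sens, spec


def median_metric(values):
    """Median across samples, skipping samples where the metric is undefined."""
    defined = [v for v in values if v is not None]
    return median(defined) if defined else None


# Toy example with hypothetical arm-level calls for one sample.
regions = {"1p", "1q", "8p", "8q", "17p", "17q"}
dna = {"1q", "8q", "17p"}
rna = {"1q", "8q", "1p"}
sens, spec = sensitivity_specificity(rna, dna, regions)
```

In this toy sample, two of the three DNA-defined events are recovered and one of the three unaltered arms is falsely called, so both metrics equal 2/3. Taking the median over per-sample values, rather than pooling all regions, keeps a few heavily altered tumors from dominating the summary.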
