Characterizing the allele- and haplotype-specific copy number landscape of cancer genomes at single-cell resolution with CHISEL

Single-cell barcoding technologies have recently been used to perform whole-genome sequencing of thousands of individual cells in parallel. These technologies provide the opportunity to characterize genomic heterogeneity at single-cell resolution, but their extremely low sequencing coverage (<0.05X per cell) has so far restricted their use to identifying the total copy number of large multi-megabase segments in individual cells. However, total copy numbers do not distinguish between the two homologous chromosomes in humans, and thus provide a limited view of tumor heterogeneity and evolution, missing important events such as copy-neutral loss-of-heterozygosity (LOH). We introduce CHISEL, the first method to infer allele- and haplotype-specific copy numbers in single cells and subpopulations of cells by aggregating sparse signal across thousands of individual cells. We applied CHISEL to 10 single-cell sequencing datasets from 2 breast cancer patients, each dataset containing ≈2000 cells. We identified extensive allele-specific copy-number aberrations (CNAs) in these samples, including copy-neutral LOH, whole-genome duplications (WGDs), and mirrored-subclonal CNAs in subpopulations of cells. These allele-specific CNAs alter the copy number of genomic regions containing well-known breast cancer genes, including TP53, BRCA2, and PTEN, but are invisible to total copy number analysis. We used CHISEL’s allele- and haplotype-specific copy numbers to derive a more refined reconstruction of tumor evolution: timing allele-specific CNAs before and after WGDs, identifying low-frequency subclones distinguished by unique CNAs, and uncovering evidence of convergent evolution. This reconstruction is supported by orthogonal analysis of somatic single-nucleotide variants (SNVs) obtained by pooling barcoded reads across clones defined by CHISEL.
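Two ideas in the abstract can be made concrete with a toy sketch: total copy number cannot separate a normal diploid state from copy-neutral LOH, and pooling sparse reads across cells can recover an allelic signal that no single cell carries. The Python below is an illustration under simplifying assumptions, not CHISEL's actual algorithm. The function names, the per-cell read counts, and the grouping of cells into clones are all hypothetical.

```python
def total_copy_number(allele_cn):
    """Sum the copy numbers of the two homologous chromosomes."""
    a, b = allele_cn
    return a + b


def is_copy_neutral_loh(allele_cn, ploidy=2):
    """True when one allele is lost but the total still equals the ploidy."""
    a, b = allele_cn
    return a + b == ploidy and min(a, b) == 0


def pooled_baf(cell_counts, cells):
    """Pool (A-allele, B-allele) read counts over a group of cells and
    return the B-allele frequency, or None if no reads cover the bin."""
    a_reads = sum(cell_counts[c][0] for c in cells)
    b_reads = sum(cell_counts[c][1] for c in cells)
    total = a_reads + b_reads
    return None if total == 0 else b_reads / total


# Hypothetical read counts for one genomic bin: (A-allele reads, B-allele reads).
# At very low coverage, each cell contributes only zero to two reads.
counts = {
    "c1": (1, 0), "c2": (0, 1), "c3": (1, 1), "c4": (0, 0),
    "c5": (2, 0), "c6": (1, 0),
}

balanced = (1, 1)   # normal diploid state
cn_loh = (2, 0)     # copy-neutral LOH

# Both states have the same total copy number, so a total-only analysis
# cannot tell them apart.
same_total = total_copy_number(balanced) == total_copy_number(cn_loh)

# Pooling over assumed groups of cells separates a balanced group from
# one that has lost the B allele.
baf_group_1 = pooled_baf(counts, ["c1", "c2", "c3", "c4"])
baf_group_2 = pooled_baf(counts, ["c5", "c6"])
```

In this toy data a single cell yields a B-allele frequency of 0, 1, or nothing at all, so the per-cell estimate is uninformative. Only the pooled estimate distinguishes the two groups, which is the motivation the abstract gives for aggregating signal across cells.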
