Single-cell tumor phylogeny inference with copy-number constrained mutation losses

Motivation: Single-cell DNA sequencing enables the measurement of somatic mutations in individual tumor cells, and provides data to reconstruct the evolutionary history of the tumor. Nearly all existing methods to construct phylogenetic trees from single-cell sequencing data use single-nucleotide variants (SNVs) as markers. However, most solid tumors contain copy-number aberrations (CNAs), which can overlap loci containing SNVs. Particularly problematic are CNAs that delete an SNV, thus returning the SNV locus to the unmutated state. Such mutation losses are allowed in some models of SNV evolution, but these models are generally too permissive, allowing mutation losses without evidence of a CNA overlapping the locus.

Results: We introduce a novel loss-supported evolutionary model, a generalization of the infinite sites and Dollo models, that constrains mutation losses to loci with evidence of a decrease in copy number. We design a new algorithm, Single-Cell Algorithm for Reconstructing the Loss-supported Evolution of Tumors (Scarlet), that infers phylogenies from single-cell tumor sequencing data using the loss-supported model and a probabilistic model of sequencing errors and allele dropout. On simulated data, we show that Scarlet outperforms current single-cell phylogeny methods, recovering more accurate trees and correcting errors in SNV data. On single-cell sequencing data from a metastatic colorectal cancer patient, Scarlet constructs a phylogeny that is more consistent with the observed copy-number data and reveals a simpler monoclonal seeding of the metastasis, contrasting with published reports of polyclonal seeding in this patient. Scarlet substantially improves single-cell phylogeny inference in tumors with CNAs, yielding new insights into the analysis of tumor evolution.

Availability: Software is available at github.com/raphael-group/scarlet

Contact: braphael@princeton.edu
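The loss-supported constraint described in the abstract can be illustrated with a small check on a candidate tree. The sketch below is not Scarlet's implementation; it is a simplified illustration under stated assumptions: the tree is given as a parent map, each node carries a set of SNVs and an integer copy number per SNV locus, each SNV may be gained at most once (as in the Dollo model), and a loss on an edge counts as "supported" only if the copy number at that locus strictly decreases along the edge. The function name `is_loss_supported` and the input layout are hypothetical.

```python
def is_loss_supported(parent, mutations, copy_number):
    """Check a simplified loss-supported condition on a rooted tree.

    parent:      dict node -> parent node (the root maps to None)
    mutations:   dict node -> set of SNV ids present at that node
    copy_number: dict node -> dict SNV id -> integer copy number at the locus

    Assumptions (illustrative only): every SNV is gained at most once,
    and a loss on edge (parent, child) is allowed only when the copy
    number at the SNV locus strictly decreases from parent to child.
    """
    gain_count = {}
    for node, par in parent.items():
        parent_muts = mutations[par] if par is not None else set()
        gained = mutations[node] - parent_muts
        lost = parent_muts - mutations[node]

        # Dollo-style restriction: each SNV arises at most once in the tree.
        for snv in gained:
            gain_count[snv] = gain_count.get(snv, 0) + 1
            if gain_count[snv] > 1:
                return False

        # Loss-supported restriction: a loss needs a copy-number decrease.
        for snv in lost:
            if copy_number[node][snv] >= copy_number[par][snv]:
                return False
    return True


# Toy example: SNV "s1" is gained at node "a" and lost at node "c".
parent = {"root": None, "a": "root", "b": "a", "c": "a"}
mutations = {"root": set(), "a": {"s1"}, "b": {"s1", "s2"}, "c": set()}

# Copy number at the s1 locus drops from 2 to 1 on edge (a, c): loss is supported.
cn_supported = {
    "root": {"s1": 2, "s2": 2},
    "a": {"s1": 2, "s2": 2},
    "b": {"s1": 2, "s2": 2},
    "c": {"s1": 1, "s2": 2},
}
# Same tree, but with no copy-number change anywhere: the loss is unsupported.
cn_unsupported = {node: {"s1": 2, "s2": 2} for node in parent}

supported = is_loss_supported(parent, mutations, cn_supported)
unsupported = is_loss_supported(parent, mutations, cn_unsupported)
```

In this toy example the same tree is accepted when the copy-number data show a deletion at the locus of the lost SNV and rejected when they do not, which is the distinction the abstract draws between the loss-supported model and more permissive loss models.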
