A Combinatorial Approach for Single-cell Variant Detection via Phylogenetic Inference

Single-cell sequencing provides a powerful approach for elucidating intratumor heterogeneity by resolving cell-to-cell variability. It also poses additional challenges, including elevated error rates, allelic dropout, and non-uniform coverage. A recently introduced single-cell-specific mutation detection algorithm leverages the evolutionary relationship between cells to denoise the data, but because of its probabilistic nature it does not scale well with the number of cells. Here, we develop a novel combinatorial approach that uses the genealogical relationship of cells to detect mutations in noisy single-cell sequencing data. Our method, called scVILP, jointly detects mutations in individual cells and reconstructs a perfect phylogeny among these cells. We employ a novel Integer Linear Program formulation to solve the joint inference problem deterministically and efficiently. On simulated data, scVILP achieves similar or better accuracy than existing methods with significantly shorter runtime. We also applied scVILP to an empirical human cancer dataset from a patient with high-grade serous ovarian cancer.
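To make the perfect-phylogeny constraint concrete, the following minimal Python sketch checks whether a binary cell-by-mutation genotype matrix is consistent with a rooted perfect phylogeny. It relies on the standard pairwise conflict (three-gamete) test under the infinite sites assumption with an unmutated root. It is an illustration only and not the scVILP Integer Linear Program; the function names and the toy matrices are hypothetical.

```python
from itertools import combinations


def has_conflict(col_a, col_b):
    """Return True if two mutation columns contain all of (1,0), (0,1), (1,1).

    Under the infinite sites model with an all-zero root, such a pair of
    mutations cannot be placed on a single tree.
    """
    observed = set(zip(col_a, col_b))
    return {(1, 0), (0, 1), (1, 1)} <= observed


def admits_perfect_phylogeny(matrix):
    """Check a binary genotype matrix (rows = cells, columns = mutations).

    Returns True when no pair of mutation columns is in conflict.
    """
    if not matrix:
        return True
    columns = list(zip(*matrix))  # transpose: one tuple per mutation
    return not any(has_conflict(a, b) for a, b in combinations(columns, 2))


# Toy data (hypothetical): four cells, three mutations, no conflicts.
clean = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]

# Toy data (hypothetical): the two mutations conflict with each other.
noisy = [
    [1, 0],
    [0, 1],
    [1, 1],
]

clean_ok = admits_perfect_phylogeny(clean)
noisy_ok = admits_perfect_phylogeny(noisy)
```

A joint inference method such as the one described above must choose genotypes that pass a test of this kind while staying as consistent as possible with the noisy sequencing data; the sketch covers only the feasibility check, not the optimization.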
