ddClone: joint statistical inference of clonal populations from single cell and bulk tumour sequencing data

Next-generation sequencing (NGS) of bulk tumour tissue can identify constituent cell populations in cancers and measure their abundance. This requires computational deconvolution of allelic counts from somatic mutations, which may be incapable of fully resolving the underlying population structure. Single cell sequencing (SCS) is a more direct method, although its replacement of NGS is impeded by technical noise and sampling limitations. We propose ddClone, which analytically integrates NGS and SCS data, leveraging their complementary attributes through joint statistical inference. We show on real and simulated datasets that ddClone produces more accurate results than can be achieved by either method alone.

[1]  N. Navin,et al.  Advances and applications of single-cell sequencing technologies. , 2015, Molecular cell.

[2]  N. Navin Cancer genomics: one cell at a time , 2014, Genome Biology.

[3]  Charles Gawad,et al.  A Quantitative Comparison of Single-Cell Whole Genome Amplification Methods , 2014, PloS one.

[4]  Shankar Vembu,et al.  PhyloWGS: Reconstructing subclonal composition and evolution from whole-genome sequencing of tumors , 2015, Genome Biology.

[5]  Wendy S. W. Wong,et al.  Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs , 2012, Bioinform..

[6]  Shankar Vembu,et al.  Inferring clonal evolution of tumors from single nucleotide somatic mutations , 2012, BMC Bioinformatics.

[7]  Junfeng Wang,et al.  Inferring Clonal Composition from Multiple Sections of a Breast Cancer , 2014, PLoS Comput. Biol..

[8]  Maxim Teslenko,et al.  MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space , 2012, Systematic biology.

[9]  W. Koh,et al.  Dissecting the clonal origins of childhood acute lymphoblastic leukemia by single-cell genomics , 2014, Proceedings of the National Academy of Sciences.

[10]  Alexandre Bouchard-Côté,et al.  Clonal genotype and population structure inference from single-cell tumor sequencing , 2016, Nature Methods.

[11]  Adrian E. Raftery,et al.  Model-Based Clustering, Discriminant Analysis, and Density Estimation , 2002 .

[12]  W. Koh,et al.  Single-cell genome sequencing: current state of the science , 2016, Nature Reviews Genetics.

[13]  Radford M. Neal Markov Chain Sampling Methods for Dirichlet Process Mixture Models , 2000 .

[14]  P. Nowell The clonal evolution of tumor cell populations. , 1976, Science.

[15]  Geoffrey I. Webb,et al.  Encyclopedia of Machine Learning , 2011, Encyclopedia of Machine Learning.

[16]  Benjamin J. Raphael,et al.  Reconstruction of clonal trees and tumor composition from multi-sample sequencing data , 2015, Bioinform..

[17]  Sohrab P. Shah,et al.  Dynamics of genomic clones in breast cancer patient xenografts at single-cell resolution , 2014, Nature.

[18]  Joseph Felsenstein,et al.  DISTANCE METHODS FOR INFERRING PHYLOGENIES: A JUSTIFICATION , 1984, Evolution; international journal of organic evolution.

[19]  J. Troge,et al.  Tumour evolution inferred by single-cell sequencing , 2011, Nature.

[20]  Florian Markowetz,et al.  OncoNEM: inferring tumor evolution from single-cell sequencing data , 2016, Genome Biology.

[21]  A. Merotto,et al.  Recurrent evolution of heat-responsiveness in Brassicaceae COPIA elements , 2016, Genome Biology.

[22]  Benjamin J. Raphael,et al.  Inferring the Mutational History of a Tumor Using Multi-state Perfect Phylogeny Mixtures. , 2016, Cell systems.

[23]  M. Tanner,et al.  Facilitating the Gibbs Sampler: The Gibbs Stopper and the Griddy-Gibbs Sampler , 1992 .

[24]  Ken Chen,et al.  Genotyping tumor clones from single-cell data , 2016, Nature Methods.

[25]  K. Ickstadt,et al.  Improved criteria for clustering based on the posterior similarity matrix , 2009 .

[26]  A. Sivachenko,et al.  Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples , 2013, Nature Biotechnology.

[27]  Sohrab P. Shah,et al.  JointSNVMix: a probabilistic model for accurate detection of somatic mutations in normal/tumour paired next-generation sequencing data , 2012, Bioinform..

[28]  Ali Bashashati,et al.  Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer , 2016, Nature Genetics.

[29]  Peter I. Frazier,et al.  Distance dependent Chinese restaurant processes , 2009, ICML.

[30]  Irmtraud M. Meyer,et al.  The clonal and mutational evolution spectrum of primary triple-negative breast cancers , 2012, Nature.

[31]  Julia Hirschberg,et al.  V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure , 2007, EMNLP.

[32]  Ken Chen,et al.  Monovar: single nucleotide variant detection in single cells , 2016, Nature Methods.

[33]  A. Bouchard-Côté,et al.  PyClone: statistical inference of clonal population structure in cancer , 2014, Nature Methods.

[34]  Christopher J. Lee,et al.  Wagner and Dollo: a stochastic duet by composing two parsimonious solos. , 2008, Systematic biology.

[35]  C. Tyler-Smith,et al.  Ancient DNA and the rewriting of human history: be sparing with Occam’s razor , 2016, Genome Biology.

[36]  Gholamreza Haffari,et al.  Feature-based classifiers for somatic mutation detection in tumour–normal paired sequencing data , 2011, Bioinform..

[37]  Ryan D. Morin,et al.  Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution , 2009, Nature.

[38]  Jenny Taylor,et al.  Monitoring chronic lymphocytic leukemia progression by whole genome sequencing reveals heterogeneous clonal evolution patterns. , 2012, Blood.

[39]  Michael I. Jordan,et al.  Evolutionary inference via the Poisson Indel Process , 2012, Proceedings of the National Academy of Sciences.