Clonal genotype and population structure inference from single-cell tumor sequencing

Single-cell DNA sequencing has great potential to reveal the clonal genotypes and population structure of human cancers. However, single-cell data suffer from missing values, biased allelic counts, and false genotype measurements that arise when multiple cells are sequenced together. We describe the Single Cell Genotyper (https://bitbucket.org/aroth85/scg), open-source software based on a statistical model coupled with a mean-field variational inference method, which can be used to address these problems and robustly infer clonal genotypes.
