BAMSE: Bayesian model selection for tumor phylogeny inference among multiple samples

Background: Intra-tumor heterogeneity is known to contribute to cancer complexity and drug resistance. Determining the number of distinct subclones and the evolutionary relationships between them is scientifically and clinically important, and it remains a challenging problem.

Results: In this paper, we present BAMSE (BAyesian Model Selection for tumor Evolution), a new probabilistic method for inferring subclonal history and reconstructing the lineage tree of heterogeneous tumor samples. BAMSE takes somatic mutation read counts as input and can leverage multiple tumor samples accurately and efficiently. In the first step, possible clusterings of mutations into subclones are scored, and a user-defined number of them are selected for further analysis. In the next step, for each of these candidates, a list of trees describing the evolutionary relationships between the subclones is generated. These trees are sorted by their posterior probability, which is calculated using a Bayesian model that integrates prior belief about the number of subclones, the composition of the tumor, and the process of subclonal evolution. BAMSE also takes sequencing error into account. We benchmarked BAMSE against state-of-the-art software using simulated datasets.

Conclusions: In this work, we developed flexible and fast software to reconstruct the history of a tumor's subclonal evolution using somatic mutation read counts across multiple samples. BAMSE is implemented in Python and is available as open source under GNU GPLv3 at https://github.com/HoseinT/BAMSE.
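To make the first step concrete, the following is a minimal sketch of scoring candidate clusterings from read counts and keeping a user-defined number of them. It is not BAMSE's actual model or code: the binomial likelihood, the symmetric sequencing-error model, the flat per-cluster `penalty` standing in for a prior on the number of subclones, and all function names are assumptions made for illustration.

```python
import math


def log_binom_pmf(k, n, p):
    """Log-probability of k variant reads out of n under Binomial(n, p)."""
    p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def observed_vaf(freq, error_rate):
    """Expected variant read fraction under an assumed symmetric error model."""
    return freq * (1 - error_rate) + (1 - freq) * error_rate


def score_clustering(clusters, variant, total, error_rate=0.001, penalty=2.0):
    """Penalized log-likelihood of one clustering (illustrative score only).

    clusters: list of lists of mutation indices.
    variant[m][s], total[m][s]: variant and total reads of mutation m in sample s.
    """
    n_samples = len(variant[0])
    score = 0.0
    for members in clusters:
        for s in range(n_samples):
            k_sum = sum(variant[m][s] for m in members)
            n_sum = sum(total[m][s] for m in members)
            # Pooled estimate of the cluster frequency, corrected for the
            # assumed error rate and clamped to [0, 1].
            freq = (k_sum / n_sum - error_rate) / (1 - 2 * error_rate)
            freq = min(max(freq, 0.0), 1.0)
            p = observed_vaf(freq, error_rate)
            for m in members:
                score += log_binom_pmf(variant[m][s], total[m][s], p)
    # Assumed stand-in for a prior that favors fewer subclones.
    return score - penalty * len(clusters)


def top_clusterings(candidates, variant, total, keep=2):
    """Rank candidate clusterings by score and keep the best `keep`."""
    ranked = sorted(candidates,
                    key=lambda c: score_clustering(c, variant, total),
                    reverse=True)
    return ranked[:keep]
```

In a full pipeline along the lines the abstract describes, each retained clustering would then be passed to a second step that enumerates candidate trees over its subclones and sorts them by posterior probability; that step is omitted here because the abstract does not specify its details.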
