MIPUP: minimum perfect unmixed phylogenies for multi-sampled tumors via branchings and ILP

Motivation: Discovering the evolution of a tumor may help identify driver mutations and provide a more comprehensive view of the tumor's history. Recent studies have tackled this problem using multiple samples sequenced from a single tumor, and the clinical implications have attracted great interest. However, such samples usually mix several distinct tumor subclones, which confounds the discovery of the tumor phylogeny.

Results: We study a natural problem formulation that requires decomposing the tumor samples into several subclones with the objective of forming a minimum perfect phylogeny. We propose an Integer Linear Programming formulation for it and implement it in a method called MIPUP. We tested the ability of MIPUP and of four popular tools (LICHeE, AncesTree, CITUP and Treeomics) to reconstruct the tumor phylogeny. On simulated data, MIPUP shows up to a 34% improvement under the ancestor-descendant relations metric. On four real datasets, MIPUP's reconstructions proved to be generally more faithful than those of LICHeE.

Availability and implementation: MIPUP is available at https://github.com/zhero9/MIPUP as open source.

Supplementary information: Supplementary data are available at Bioinformatics online.
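To make the notion of a perfect phylogeny concrete, the following minimal sketch checks the standard pairwise condition on a binary matrix and shows how splitting a mixed sample can remove a conflict. It assumes a binary presence/absence encoding (rows are samples or subclones, columns are mutations) and a mutation-free root, neither of which the abstract specifies. The function name `is_conflict_free` and the toy matrices are hypothetical illustrations; this is not MIPUP's ILP and does not minimize anything.

```python
from itertools import combinations


def is_conflict_free(matrix):
    """Return True if the binary matrix admits a rooted perfect phylogeny.

    Assumption: rows are samples/subclones, columns are mutations, and the
    root carries no mutation. Under that assumption a perfect phylogeny
    exists exactly when, for every pair of mutations, the sets of rows
    carrying them are either disjoint or nested.
    """
    if not matrix:
        return True
    n_cols = len(matrix[0])
    # For each mutation (column), collect the rows that carry it.
    carriers = [
        {r for r, row in enumerate(matrix) if row[c]} for c in range(n_cols)
    ]
    for a, b in combinations(carriers, 2):
        # A conflict: the two carrier sets overlap but neither contains the other.
        if a & b and not (a <= b or b <= a):
            return False
    return True


# Toy input: the third sample carries both mutation 1 and mutation 2,
# while the other two samples carry only one of them each.
mixed = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]

# Hypothetical decomposition: the third sample is replaced by two subclone
# rows whose element-wise OR equals the original row [1, 1, 1].
unmixed = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
]

print(is_conflict_free(mixed))
print(is_conflict_free(unmixed))
```

In this toy case the decomposition introduces no new distinct rows, so the unmixed matrix has only two distinct subclones. Finding such a decomposition that is smallest in the sense defined by the paper is the optimization problem the ILP addresses; the check above only verifies a candidate.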
