Inferring the Mutational History of a Tumor Using Multi-state Perfect Phylogeny Mixtures.

Phylogenetic techniques are increasingly applied to infer the somatic mutational history of a tumor from DNA sequencing data. However, standard phylogenetic tree reconstruction techniques do not account for the fact that bulk sequencing data measure mutations in a population of cells. We formulate and solve the multi-state perfect phylogeny mixture deconvolution problem: reconstructing a phylogenetic tree given mixtures of its leaves, under the multi-state perfect phylogeny (infinite alleles) model. Our algorithm, somatic phylogeny reconstruction using combinatorial enumeration (SPRUCE), uses this model to construct phylogenetic trees jointly from single-nucleotide variants (SNVs) and copy-number aberrations (CNAs). We show that SPRUCE addresses complexities in the simultaneous analysis of SNVs and CNAs. In particular, there are often many possible phylogenetic trees consistent with the data, but this ambiguity decreases considerably as the number of samples increases. These findings have implications for tumor sequencing strategies, suggest caution in drawing strong conclusions from a single tree reconstruction, and explain the difficulties faced when applying existing phylogenetic techniques to tumor sequencing data.
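
As a toy illustration of what "mixtures of its leaves" means, the sketch below computes the state frequencies that a bulk sample would show if it were a known mixture of tree leaves. This is only the forward direction of the problem, not the SPRUCE algorithm; the clone names, character states, and proportions are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction

# Hypothetical leaves (clones) of a multi-state tree. Each leaf assigns one
# state to every character; state 0 stands for the unmutated state.
leaves = {
    "normal": (0, 0),
    "clone_a": (1, 0),
    "clone_b": (1, 2),
}

def state_frequencies(mixture, leaves, num_states):
    """Return freq[c][s]: the fraction of cells with state s at character c."""
    num_chars = len(next(iter(leaves.values())))
    freq = [[Fraction(0)] * num_states for _ in range(num_chars)]
    for name, proportion in mixture.items():
        for c, s in enumerate(leaves[name]):
            freq[c][s] += proportion
    return freq

# One bulk sample: assumed mixing proportions over the leaves (sum to 1).
sample = {
    "normal": Fraction(1, 2),
    "clone_a": Fraction(1, 4),
    "clone_b": Fraction(1, 4),
}
freq = state_frequencies(sample, leaves, num_states=3)
```

The deconvolution problem runs in the opposite direction: only frequencies such as `freq` are observed, for one or more samples, and both the leaves (with the tree relating them) and the mixing proportions must be inferred. Several different trees can produce the same frequencies, which is the ambiguity the abstract refers to.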
