The Copy-Number Tree Mixture Deconvolution Problem and Applications to Multi-sample Bulk Sequencing Tumor Data

Cancer is an evolutionary process driven by somatic mutation. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the mutational complexity of cancer and the fact that nearly all cancer sequencing is of bulk tissue, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest number of CNAs that explain the copy number data from multiple samples of a tumor. CNTMD generalizes two approaches that have been researched intensively in recent years: deconvolution/factorization algorithms that aim to infer the number and proportions of clones in a mixed tumor sample; and phylogenetic models of copy number evolution that model the dependencies between copy number events that affect the same genomic loci. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that perform either deconvolution or phylogenetic tree construction under the assumption of a single tumor clone per sample. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher-resolution view of copy number evolution of this cancer than published analyses.

[1]  M. Nykter,et al.  The Evolutionary History of Lethal Metastatic Prostate Cancer , 2015, Nature.

[2]  Shankar Vembu,et al.  PhyloWGS: Reconstructing subclonal composition and evolution from whole-genome sequencing of tumors , 2015, Genome Biology.

[3]  Christopher J. R. Illingworth,et al.  High-Definition Reconstruction of Clonal Composition in Cancer , 2014, Cell reports.

[4]  Jian Ma,et al.  Allele-Specific Quantification of Structural Variations in Cancer Genomes , 2016, bioRxiv.

[5]  Benjamin J. Raphael,et al.  Inferring the Mutational History of a Tumor Using Multi-state Perfect Phylogeny Mixtures. , 2016, Cell systems.

[6]  Bas E. Dutilh,et al.  Dispersion of the HIV-1 Epidemic in Men Who Have Sex with Men in the Netherlands: A Combined Mathematical Model and Phylogenetic Analysis , 2015, PLoS medicine.

[7]  Marcin J. Skwark,et al.  Improving Contact Prediction along Three Dimensions , 2014, PLoS Comput. Biol..

[8]  P. Nowell The clonal evolution of tumor cell populations. , 1976, Science.

[9]  Chuan He,et al.  Fate by RNA methylation: m6A steers stem cell pluripotency , 2015, Genome Biology.

[10]  Nancy R. Zhang,et al.  Assessing intratumor heterogeneity and tracking longitudinal and spatial clonal evolutionary history by next-generation sequencing , 2016, Proceedings of the National Academy of Sciences.

[11]  C. Perou,et al.  Allele-specific copy number analysis of tumors , 2010, Proceedings of the National Academy of Sciences.

[12]  A. McKenna,et al.  Absolute quantification of somatic DNA alterations in human cancer , 2012, Nature Biotechnology.

[13]  Subramanian Venkatesan,et al.  Tumor Evolutionary Principles: How Intratumor Heterogeneity Influences Cancer Treatment and Outcome. , 2016, American Society of Clinical Oncology educational book. American Society of Clinical Oncology. Annual Meeting.

[14]  James D. Brenton,et al.  Phylogenetic Quantification of Intra-tumour Heterogeneity , 2013, PLoS Comput. Biol..

[15]  Ron Shamir,et al.  A Linear-Time Algorithm for the Copy Number Transformation Problem , 2017, J. Comput. Biol..

[16]  S. Gabriel,et al.  Pan-cancer patterns of somatic copy-number alteration , 2013, Nature Genetics.

[17]  Nilgun Donmez,et al.  Clonality inference in multiple tumor samples using phylogeny , 2015, Bioinform..

[18]  Cédric Chauve,et al.  Joint Inference of Genome Structure and Content in Heterogeneous Tumor Samples , 2015, RECOMB.

[19]  W. Koh,et al.  Single-cell genome sequencing: current state of the science , 2016, Nature Reviews Genetics.

[20]  Russell Schwartz,et al.  Algorithms to Model Single Gene, Single Chromosome, and Whole Genome Copy Number Changes Jointly in Tumor Phylogenetics , 2014, PLoS Comput. Biol..

[21]  Alexander Davis,et al.  Computing tumor trees from single cells , 2016, Genome Biology.

[22]  Russell Schwartz,et al.  Medoidshift clustering applied to genomic bulk tumor data , 2016, BMC Genomics.

[23]  Sohrab P. Shah,et al.  TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data , 2014, Genome research.

[24]  A. Børresen-Dale,et al.  The Life History of 21 Breast Cancers , 2012, Cell.

[25]  P. A. Futreal,et al.  Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. , 2012, The New England journal of medicine.

[26]  K. Gunderson,et al.  Comparison of the Agilent, ROMA/NimbleGen and Illumina platforms for classification of copy number alterations in human breast tumors , 2008, BMC Genomics.

[27]  Ali Bashashati,et al.  Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer , 2016, Nature Genetics.

[28]  Benjamin J. Raphael,et al.  Reconstructing cancer genomes from paired-end sequencing data , 2012, BMC Bioinformatics.

[29]  C. Tyler-Smith,et al.  Ancient DNA and the rewriting of human history: be sparing with Occam’s razor , 2016, Genome Biology.

[30]  Benjamin J. Raphael,et al.  Reconstruction of clonal trees and tumor composition from multi-sample sequencing data , 2015, Bioinform..

[31]  Ron Shamir,et al.  Copy-Number Evolution Problems: Complexity and Algorithms , 2016, WABI.

[32]  Evis Sala,et al.  Spatial and Temporal Heterogeneity in High-Grade Serous Ovarian Cancer: A Phylogenetic Analysis , 2015, PLoS medicine.

[33]  C. Curtis,et al.  A Big Bang model of human colorectal tumor growth , 2015, Nature Genetics.

[34]  Benjamin J. Raphael,et al.  THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data , 2013, Genome Biology.