Deconvolution and phylogeny inference of structural variations in tumor genomic samples

Motivation: Phylogenetic reconstruction of tumor evolution has emerged as a crucial tool for making sense of the complexity of emerging cancer genomic datasets. Despite the growing use of phylogenetics in cancer studies, the field has only slowly adapted to the many ways in which tumor evolution differs from classic species evolution. One crucial question in that regard is how to handle inference of structural variations (SVs), which are a major mechanism of evolution in cancers but have been largely neglected in tumor phylogenetics to date, in part because of the challenges of reliably detecting and typing SVs and interpreting them phylogenetically.

Results: We present a novel method for reconstructing evolutionary trajectories of SVs from bulk whole‐genome sequence data via joint deconvolution and phylogenetics, which infers clonal sub‐populations and reconstructs their ancestry. We establish a novel likelihood model for joint deconvolution and phylogenetic inference on bulk SV data and formulate an associated optimization algorithm. On simulated data, we demonstrate the approach to be efficient and accurate for realistic scenarios of SV mutation. Application to breast cancer genomic data from The Cancer Genome Atlas shows it to be practical and effective at reconstructing features of SV‐driven evolution in single tumors.

Availability and implementation: Python source code and associated documentation are available at https://github.com/jaebird123/tusv.
