Learning mutational graphs of individual tumor evolution from multi-sample sequencing data

Background: A large number of algorithms are being developed to reconstruct evolutionary models of individual tumours from genome sequencing data. Most methods can analyze multiple samples collected either through bulk multi-region sequencing experiments or through the sequencing of individual cancer cells. However, the same method can rarely support both data types.

Results: We introduce TRaIT, a computational framework to infer mutational graphs that model the accumulation of multiple types of somatic alterations driving tumour evolution. Compared to other tools, TRaIT supports multi-region and single-cell sequencing data within the same statistical framework, and delivers expressive models that capture many complex evolutionary phenomena. TRaIT improves on competing methods in accuracy, robustness to data-specific errors, and computational complexity.

Conclusions: We show that the application of TRaIT to single-cell and multi-region cancer datasets can produce accurate and reliable models of single-tumour evolution, quantify the extent of intra-tumour heterogeneity, and generate new testable experimental hypotheses.
