HyperTraPS: Inferring Probabilistic Patterns of Trait Acquisition in Evolutionary and Disease Progression Pathways.

The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalizable statistical platform to infer the dynamic pathways by which many, potentially interacting, traits are acquired or lost over time. We use HyperTraPS (hypercubic transition path sampling) to efficiently learn progression pathways from cross-sectional, longitudinal, or phylogenetically linked data, readily distinguishing multiple competing pathways and identifying the most parsimonious mechanisms underlying given observations. This Bayesian approach allows inclusion of prior knowledge, quantifies uncertainty in pathway structure, and supports predictions, such as which symptom a patient will acquire next. We provide visualization tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.
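To make the idea of pathways on a hypercube concrete, the following toy sketch simulates the irreversible acquisition of binary traits, one at a time, from the all-absent state to the all-present state. It is not the HyperTraPS implementation and performs no inference. The pairwise parameterization (`log_rates[i][i]` as a base log-rate for trait `i`, `log_rates[j][i]` as the influence of an acquired trait `j` on trait `i`) and the function names are assumptions made for illustration only.

```python
import math
import random


def acquisition_weights(state, log_rates):
    """Unnormalised weight for acquiring each trait from the current state.

    Assumed (hypothetical) model: log-weight of trait i is its base term
    log_rates[i][i] plus log_rates[j][i] for every trait j already acquired.
    Traits that are already present get weight 0.
    """
    n = len(state)
    weights = []
    for i in range(n):
        if state[i]:
            weights.append(0.0)
            continue
        log_w = log_rates[i][i] + sum(log_rates[j][i] for j in range(n) if state[j])
        weights.append(math.exp(log_w))
    return weights


def next_trait_probabilities(state, log_rates):
    """Probability that each trait is the next one acquired from `state`."""
    weights = acquisition_weights(state, log_rates)
    total = sum(weights)
    return [w / total for w in weights]


def sample_pathway(log_rates, rng):
    """Sample one acquisition order: a walk along edges of the hypercube."""
    n = len(log_rates)
    state = [0] * n
    order = []
    while len(order) < n:
        weights = acquisition_weights(state, log_rates)
        nxt = rng.choices(range(n), weights=weights, k=1)[0]
        state[nxt] = 1
        order.append(nxt)
    return order


if __name__ == "__main__":
    # Illustrative parameters only: trait 0 promotes trait 1 and represses trait 2.
    toy_rates = [
        [0.0, 2.0, -1.0],
        [0.0, 0.0, 1.0],
        [0.0, 0.0, -0.5],
    ]
    print(sample_pathway(toy_rates, random.Random(1)))
    print(next_trait_probabilities([1, 0, 0], toy_rates))
```

Under these assumptions, `next_trait_probabilities` plays the role of the "which trait comes next" prediction mentioned above: given a partially progressed state, it returns a distribution over the traits not yet acquired. In a Bayesian setting the parameters would be sampled from a posterior rather than fixed by hand as they are here.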
