Modeling cumulative biological phenomena with Suppes-Bayes Causal Networks

Several diseases related to cell proliferation are characterized by the accumulation of somatic DNA changes with respect to wild-type conditions. Cancer and HIV are two common examples of such diseases, where the mutational load in the cancerous or viral population increases over time. In these cases, selective pressures are often observed along with competition, cooperation and parasitism among distinct cellular clones. Recently, we presented a mathematical framework to model these phenomena, based on a combination of Bayesian inference and Suppes' theory of probabilistic causation, depicted in graphical structures dubbed Suppes-Bayes Causal Networks (SBCNs). SBCNs are generative probabilistic graphical models that recapitulate the potential ordering of accumulation of such DNA changes during the progression of the disease. Such models can be inferred from data by exploiting likelihood-based model-selection strategies with regularization. In this paper we discuss the theoretical foundations of our approach and investigate in depth the influence on the model-selection task of: (i) the poset based on Suppes' theory and (ii) different regularization strategies. Furthermore, we provide an example application of our framework to HIV genetic data, highlighting the valuable insights provided by the inferred SBCN.
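As an illustration of the poset in point (i), the following minimal Python sketch assumes the two conditions usually associated with Suppes' probabilistic causation: temporal priority, approximated in cross-sectional data by P(a) > P(b), and probability raising, P(b | a) > P(b | not a). The function name `suppes_poset` and the 0/1 sample encoding are hypothetical choices made for this example, not the authors' implementation, and the sketch omits any statistical testing of the two inequalities.

```python
from itertools import permutations


def suppes_poset(samples):
    """Return the ordered pairs (a, b) that satisfy both assumed conditions.

    samples: list of equal-length 0/1 lists, one row per sample and one
    column per event (for example, one mutation per column).
    """
    n = len(samples)
    m = len(samples[0])
    # Marginal frequency of each event across all samples.
    marginal = [sum(row[j] for row in samples) / n for j in range(m)]
    edges = set()
    for a, b in permutations(range(m), 2):
        n_a = sum(row[a] for row in samples)
        n_not_a = n - n_a
        if n_a == 0 or n_not_a == 0:
            # The conditional probabilities are undefined; skip the pair.
            continue
        p_b_given_a = sum(row[b] for row in samples if row[a]) / n_a
        p_b_given_not_a = sum(row[b] for row in samples if not row[a]) / n_not_a
        temporal_priority = marginal[a] > marginal[b]
        probability_raising = p_b_given_a > p_b_given_not_a
        if temporal_priority and probability_raising:
            edges.add((a, b))
    return edges


# Toy data: event 1 only ever occurs in samples that also carry event 0.
toy = [[1, 1], [1, 1], [1, 0], [0, 0], [0, 0]]
print(suppes_poset(toy))
```

In a pipeline of the kind the abstract describes, a set of candidate arcs like this would restrict the search space in which a regularized likelihood score selects the final network.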
