From word models to executable models of signaling networks using automated assembly

Word models (natural language descriptions of molecular mechanisms) are a common currency in spoken and written communication in biomedicine but are of limited use in predicting the behavior of complex biological networks. We present an approach to building computational models directly from natural language using automated assembly. Molecular mechanisms described in simple English are read by natural language processing algorithms, converted into an intermediate representation and assembled into executable or network models. We have implemented this approach in the Integrated Network and Dynamical Reasoning Assembler (INDRA), which draws on existing natural language processing systems as well as pathway information in Pathway Commons and other online resources. We demonstrate the use of INDRA and natural language to model three biological processes of increasing scope: (i) p53 dynamics in response to DNA damage; (ii) adaptive drug resistance in BRAF-V600E mutant melanomas; and (iii) the RAS signaling pathway. The use of natural language for modeling makes routine tasks more efficient for modeling practitioners and increases the accessibility and transparency of models for the broader biology community.

Standfirst text

INDRA uses natural language processing systems to read descriptions of molecular mechanisms and assembles them into executable models.

Highlights

INDRA decouples the curation of knowledge as word models from model implementation

INDRA is connected to multiple natural language processing systems and can draw on information from curated databases

INDRA can assemble dynamical models in rule-based and reaction network formalisms, as well as Boolean networks and visualization formats

We used INDRA to build models of p53 dynamics, resistance to targeted inhibitors of BRAF in melanoma, and the Ras signaling pathway from natural language
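The three stages described in the abstract (reading simple English, converting it into an intermediate representation, and assembling different model types from that representation) can be illustrated with a toy sketch. This is not INDRA's code or API: the regular-expression reader, the Statement class, the verb table and the rule templates below are all hypothetical stand-ins, chosen only to show why separating the intermediate representation from the output format lets one set of sentences yield both reaction-style rules and a network.

```python
import re
from dataclasses import dataclass

# Hypothetical toy vocabulary (verb -> statement type); not INDRA's grammar.
VERBS = {
    "phosphorylates": "Phosphorylation",
    "binds": "Complex",
    "activates": "Activation",
    "inhibits": "Inhibition",
}

# Hypothetical rule templates, one per statement type.
TEMPLATES = {
    "Phosphorylation": "{s} + {o}(u) -> {s} + {o}(p)",
    "Complex": "{s} + {o} <-> {s}:{o}",
    "Activation": "{s} + {o}(inactive) -> {s} + {o}(active)",
    "Inhibition": "{s} + {o}(active) -> {s} + {o}(inactive)",
}


@dataclass(frozen=True)
class Statement:
    """Intermediate representation: one mechanism between two agents."""
    kind: str
    subj: str
    obj: str


def read_sentence(sentence):
    """Parse '<A> <verb> <B>' into a Statement; return None if not understood."""
    m = re.fullmatch(r"\s*(\w+)\s+(\w+)\s+(\w+)\s*\.?\s*", sentence)
    if m is None or m.group(2) not in VERBS:
        return None
    subj, verb, obj = m.groups()
    return Statement(VERBS[verb], subj, obj)


def read_text(text):
    """Read period-separated sentences, dropping unparsed and duplicate ones."""
    stmts = []
    for sentence in text.split("."):
        if not sentence.strip():
            continue
        stmt = read_sentence(sentence)
        if stmt is not None and stmt not in stmts:
            stmts.append(stmt)
    return stmts


def assemble_rules(stmts):
    """Assemble Statements into reaction-style rule strings."""
    return [TEMPLATES[s.kind].format(s=s.subj, o=s.obj) for s in stmts]


def assemble_edges(stmts):
    """Assemble the same Statements into (source, type, target) network edges."""
    return [(s.subj, s.kind, s.obj) for s in stmts]


if __name__ == "__main__":
    text = "ATM phosphorylates TP53. MDM2 inhibits TP53. ATM phosphorylates TP53."
    statements = read_text(text)
    for rule in assemble_rules(statements):
        print(rule)
    for edge in assemble_edges(statements):
        print(edge)
```

In this sketch the reader and the two assemblers only share the Statement type, so a different reader (or a curated database) could supply Statements without any change to the assemblers, which is the decoupling the highlights describe.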
