Combining computational models, semantic annotations and simulation experiments in a graph database

Model repositories such as the BioModels Database, the CellML Model Repository or JWS Online are frequently accessed to retrieve computational models of biological systems. However, their storage concepts support only restricted types of queries, and not all data inside the repositories can be retrieved. In this article we present a storage concept that meets this challenge. It is based on a graph database, reflects the models’ structure, incorporates semantic annotations and simulation descriptions, and ultimately connects different types of model-related data. The connections between heterogeneous model-related data and bio-ontologies enable efficient search via biological facts and grant access to new model features. The introduced concept notably improves access to computational models and associated simulations in a model repository. This has positive effects on tasks such as model search, retrieval, ranking, matching and filtering. Furthermore, our work for the first time enables CellML- and Systems Biology Markup Language-encoded models to be effectively maintained in one database. We show how these models can be linked via annotations and queried. Database URL: https://sems.uni-rostock.de/projects/masymos/
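The idea of linking models in different formats through shared annotations can be illustrated with a small in-memory property graph. The Python sketch below is not the authors' implementation and does not use any graph database API. The node labels, the relationship names (BELONGS_TO, ANNOTATED_WITH, SIMULATES), the ontology term and all identifiers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ModelGraph:
    """Tiny in-memory property graph (illustrative only, not a real graph database)."""

    def __init__(self):
        self.nodes = {}                      # node id -> properties (incl. "label")
        self.out_edges = defaultdict(list)   # source id -> [(relationship, target id)]
        self.in_edges = defaultdict(list)    # target id -> [(relationship, source id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, src, rel, dst):
        self.out_edges[src].append((rel, dst))
        self.in_edges[dst].append((rel, src))

    def models_annotated_with(self, term_id):
        """Return ids of models that contain an entity annotated with term_id."""
        found = set()
        for rel, entity in self.in_edges[term_id]:
            if rel != "ANNOTATED_WITH":
                continue
            # Follow the structural edge from the model entity up to its model.
            for rel2, target in self.out_edges[entity]:
                if rel2 == "BELONGS_TO" and self.nodes[target]["label"] == "Model":
                    found.add(target)
        return sorted(found)

    def simulations_for(self, model_id):
        """Return ids of simulation descriptions that point at model_id."""
        return sorted(src for rel, src in self.in_edges[model_id] if rel == "SIMULATES")


def build_example():
    """Build a hypothetical graph with one SBML model, one CellML model and one simulation."""
    g = ModelGraph()
    # Two models in different formats (hypothetical ids).
    g.add_node("sbml_model_1", "Model", format="SBML")
    g.add_node("cellml_model_1", "Model", format="CellML")
    # Structural entities that reflect each model's internal structure.
    g.add_node("species_a", "Species")
    g.add_node("component_b", "Component")
    g.add_edge("species_a", "BELONGS_TO", "sbml_model_1")
    g.add_edge("component_b", "BELONGS_TO", "cellml_model_1")
    # A shared ontology term (hypothetical) links both models via annotations.
    g.add_node("EX:0001", "OntologyTerm", name="example biological process")
    g.add_edge("species_a", "ANNOTATED_WITH", "EX:0001")
    g.add_edge("component_b", "ANNOTATED_WITH", "EX:0001")
    # A simulation description that refers to the SBML model.
    g.add_node("sedml_1", "Simulation", format="SED-ML")
    g.add_edge("sedml_1", "SIMULATES", "sbml_model_1")
    return g


g = build_example()
print(g.models_annotated_with("EX:0001"))
print(g.simulations_for("sbml_model_1"))
```

Because both the SBML species and the CellML component point at the same term node, a single traversal starting from that term reaches models in either format. In a production setting the same pattern would be expressed in the query language of the chosen graph database.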
