Harmonizing semantic annotations for computational models in biology

Abstract

Life science researchers use computational models to articulate and test hypotheses about the behavior of biological systems. Semantic annotation is critical for making such models interoperable and reusable, and for integrating the data needed to parameterize and validate them. Encoded as machine-readable links to knowledge resource terms, semantic annotations describe the computational or biological meaning of what models and data represent. These annotations help researchers find and repurpose models, accelerate model composition, and enable knowledge integration across model repositories and experimental data stores. However, realizing these benefits requires model annotation standards that adhere to a community-based annotation protocol. Without such standards, tool developers must account for a variety of annotation formats and approaches, a situation that can become prohibitively cumbersome and that can defeat the purpose of linking model elements to controlled knowledge resource terms. Currently, no consensus protocol for semantic annotation exists among the larger biological modeling community. Here, we report on the landscape of current annotation practices among the COmputational Modeling in BIology NEtwork (COMBINE) community and provide a set of recommendations for building a consensus approach to semantic annotation.
