Evolution of computational models in BioModels Database and the Physiome Model Repository

BackgroundA useful model is one that is being (re)used. The development of a successful model does not finish with its publication. During reuse, models are being modified, i.e. expanded, corrected, and refined. Even small changes in the encoding of a model can, however, significantly affect its interpretation. Our motivation for the present study is to identify changes in models and make them transparent and traceable.MethodsWe analysed 13734 models from BioModels Database and the Physiome Model Repository. For each model, we studied the frequencies and types of updates between its first and latest release. To demonstrate the impact of changes, we explored the history of a Repressilator model in BioModels Database.ResultsWe observed continuous updates in the majority of models. Surprisingly, even the early models are still being modified. We furthermore detected that many updates target annotations, which improves the information one can gain from models. To support the analysis of changes in model repositories we developed MoSt, an online tool for visualisations of changes in models. The scripts used to generate the data and figures for this study are available from GitHub github.com/binfalse/BiVeS-StatsGenerator and as a Docker image at hub.docker.com/r/binfalse/bives-statsgenerator. The website most.bio.informatik.uni-rostock.de provides interactive access to model versions and their evolutionary statistics.ConclusionThe reuse of models is still impeded by a lack of trust and documentation. A detailed and transparent documentation of all aspects of the model, including its provenance, will improve this situation. Knowledge about a model’s provenance can avoid the repetition of mistakes that others already faced. More insights are gained into how the system evolves from initial findings to a profound understanding. We argue that it is the responsibility of the maintainers of model repositories to offer transparent model provenance to their users.

[1]  Deborah L. McGuinness,et al.  PROV-O: The PROV Ontology , 2013 .

[2]  Sarala M. Wimalaratne,et al.  The Systems Biology Graphical Notation , 2009, Nature Biotechnology.

[3]  Michael Hucka,et al.  The Systems Biology Markup Language (SBML): Language Specification for Level 3 Version 1 Core (Release 1 Candidate) , 2010 .

[4]  Olaf Wolkenhauer,et al.  Improving the reuse of computational models through version control , 2013, Bioinform..

[5]  Michael Hucka,et al.  The Systems Biology Markup Language (SBML): Language Specification for Level 3 Version 1 Core , 2010, J. Integr. Bioinform..

[6]  Carlos González-Alcón,et al.  Modeling of leishmaniasis infection dynamics: novel application to the design of effective therapies , 2012, BMC Systems Biology.

[7]  Jeffrey Heer,et al.  SpanningAspectRatioBank Easing FunctionS ArrayIn ColorIn Date Interpolator MatrixInterpola NumObjecPointI Rectang ISchedu Parallel Pause Scheduler Sequen Transition Transitioner Transiti Tween Co DelimGraphMLCon IData JSONCon DataField DataSc Dat DataSource Data DataUtil DirtySprite LineS RectSprite , 2011 .

[8]  Ron Henkel,et al.  Notions of similarity for systems biology models , 2016, Briefings Bioinform..

[9]  Jacky L. Snoep,et al.  BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems , 2005, Nucleic Acids Res..

[10]  Falk Schreiber,et al.  Editing, validating and translating of SBGN maps , 2010, Bioinform..

[11]  A. Millar,et al.  The clock gene circuit in Arabidopsis includes a repressilator with additional feedback loops , 2012, Molecular systems biology.

[12]  Hiroaki Kitano,et al.  The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models , 2003, Bioinform..

[13]  Jeffrey Heer,et al.  D³ Data-Driven Documents , 2011, IEEE Transactions on Visualization and Computer Graphics.

[14]  Yangyang Zhao,et al.  BioModels: ten-year anniversary , 2014, Nucleic Acids Res..

[15]  Peter J. Hunter,et al.  Bioinformatics Applications Note Databases and Ontologies the Physiome Model Repository 2 , 2022 .

[16]  Olaf Wolkenhauer,et al.  An algorithm to detect and communicate the differences in computational models describing biological systems , 2015, Bioinform..

[17]  I. Nookaew,et al.  Integration of clinical data with a genome-scale metabolic model of the human adipocyte , 2013, Molecular systems biology.

[18]  M. Elowitz,et al.  A synthetic oscillatory network of transcriptional regulators , 2000, Nature.

[19]  Matthias Klapperstück,et al.  VANTED v2: a framework for systems biology applications , 2012, BMC Systems Biology.

[20]  Nicolas Le Novère,et al.  Ranked retrieval of Computational Biology models , 2010, BMC Bioinformatics.

[21]  Maria Liakata,et al.  On the formalization and reuse of scientific research , 2011, Journal of The Royal Society Interface.

[22]  Ronan M. T. Fleming,et al.  A community-driven global reconstruction of human metabolism , 2013, Nature Biotechnology.

[23]  Hugh D. Spence,et al.  Minimum information requested in the annotation of biochemical models (MIRIAM) , 2005, Nature Biotechnology.

[24]  Peter J. Hunter,et al.  An Overview of CellML 1.1, a Biological Model Description Language , 2003, Simul..

[25]  Olaf Wolkenhauer,et al.  COMODI: an ontology to characterise differences in versions of computational models in biology , 2016, Journal of Biomedical Semantics.

[26]  Taesung Park,et al.  Phenotype prediction from genome-wide association studies: application to smoking behaviors , 2012, BMC Systems Biology.

[27]  Erik Schultes,et al.  The FAIR Guiding Principles for scientific data management and stewardship , 2016, Scientific Data.

[28]  Peter J. Hunter,et al.  Revision history aware repositories of computational models of biological systems , 2011, BMC Bioinformatics.

[29]  Melanie I. Stefan,et al.  BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models , 2010, BMC Systems Biology.