Management of simulation studies in computational biology

Data management is a well-defined field of computer science that investigates methods for organising and controlling the information generated during research projects. It comprises several tasks, including data storage, search, retrieval, version control and provenance tracking. Effective data management strategies for computational biology are needed to handle the increasing amount of data that is generated and processed: high-throughput experiments generate large amounts of data; computational models are becoming more complex; novel methods for model coupling enable researchers to combine models into even larger systems; increasing computational power allows for more complex simulations; and the availability of data at different scales demands clever integration techniques. At the same time, recent studies have shown that the rate of reproducibility of scientific results in the life sciences, including computational biology, is unacceptably low [Ioa14]. As a consequence, efforts have been launched to improve the reusability and reproducibility of biomedical results (e.g., [M14, Ioa14]), and of simulation study results in particular [W11a, B14, C15]. Today, paths towards improved data management are discussed by funders and publishers, in large-scale projects and by individual researchers. For example, funders have established policies such as the ERASysAPP Data Management Guidelines; the German Network for Bioinformatics Infrastructure, de.NBI (http://www.denbi.de), has dedicated data management centers; and projects such as FAIR-DOM (http://fair-dom.org) are funded to develop support for sustainable data management.

[1] Peter J. Hunter, et al. An Overview of CellML 1.1, a Biological Model Description Language, 2003, Simul.

[2] Carole A. Goble, et al. SEEK: a systems biology data and model management platform, 2015, BMC Systems Biology.

[3] Olaf Wolkenhauer, et al. Improving the reuse of computational models through version control, 2013, Bioinform.

[4] John P. A. Ioannidis, et al. How to Make More Published Research True, 2014, PLoS Medicine.

[5] A. Goldbeter, et al. A minimal cascade model for the mitotic oscillator involving cyclin and cdc2 kinase, 1991, Proceedings of the National Academy of Sciences of the United States of America.

[6] Sarala M. Wimalaratne, et al. The Systems Biology Graphical Notation, 2009, Nature Biotechnology.

[7] Gary R. Mirams, et al. High-throughput functional curation of cellular electrophysiology models, 2011, Progress in Biophysics and Molecular Biology.

[8] Melanie I. Stefan, et al. BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models, 2010, BMC Systems Biology.

[9] Chris J. Myers, et al. Meeting report from the fourth meeting of the Computational Modeling in Biology Network (COMBINE), 2011, Standards in Genomic Sciences.

[10] BMC Bioinformatics, 2005.

[11] Jacky L. Snoep, et al. Reproducible computational biology experiments with SED-ML - The Simulation Experiment Description Markup Language, 2011, BMC Systems Biology.

[12] Dagmar Waltemath, et al. A call for virtual experiments: accelerating the scientific process, 2015, Progress in Biophysics and Molecular Biology.

[13] Dagmar Waltemath, et al. Extracting reproducible simulation studies from model repositories using the CombineArchive Toolkit, 2015, BTW Workshops.

[14] 김삼묘, et al. Introducing the special issue on "Bioinformatics", 2000.

[15] Olaf Wolkenhauer, et al. An algorithm to detect and communicate the differences in computational models describing biological systems, 2015, Bioinform.

[16] C. Daub, et al. BMC Systems Biology, 2007.

[17] Biomedical research: increasing value, reducing waste, 2014.

[18] Michel Dumontier, et al. Controlled vocabularies and semantics in systems biology, 2011, Molecular Systems Biology.

[19] Olaf Wolkenhauer, et al. Combining computational models, semantic annotations and simulation experiments in a graph database, 2015, Database J. Biol. Databases Curation.

[20] Edmund J. Crampin, et al. Minimum Information About a Simulation Experiment (MIASE), 2011, PLoS Comput. Biol.

[21] Dagmar Waltemath, et al. How Can Semantic Annotations Support the Identification of Network Similarities?, 2014, SWAT4LS.

[22] Olaf Wolkenhauer, et al. Annotation-based feature extraction from sets of SBML models, 2014, Journal of Biomedical Semantics.

[23] Nicolas Le Novère, et al. COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project, 2014, BMC Bioinformatics.

[24] A. Pühler, et al. Molecular systems biology, 2007.

[25] R. Bogacz, et al. Progress in Biophysics and Molecular Biology, 2010.

[26] Hiroaki Kitano, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models, 2003, Bioinform.

[27] Nicolas Le Novère, et al. Ranked retrieval of Computational Biology models, 2010, BMC Bioinformatics.

[28] Peter J. Hunter, et al. The Physiome Model Repository 2, 2022.