COMBINE archive: One File To Share Them All

Background: With the ever-increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. As part of this effort, various standards have been proposed for describing models, simulations, data, and other essential information. These descriptions are separate components, all of which are required to reproduce a published scientific result.

Results: In this work we describe the Open Modeling EXchange format (OMEX), which allows all the necessary information to be bundled together into one file. Together with other COMBINE standard formats, OMEX is the basis of the COMBINE archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE archive consists of files encoded in COMBINE standards whenever possible, but may include additional files, each identified by an Internet Media Type. Several tools supporting the COMBINE archive are available, either as independent libraries or embedded in modeling software.

Conclusions: The COMBINE archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also supports the building of logs and audit trails. We anticipate that the COMBINE archive will become a significant help as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
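The container layout described in the Results (a ZIP file holding a manifest that lists every entry together with its format) can be illustrated with a minimal Python sketch using only the standard library. The abstract does not give the manifest syntax, so the file name `manifest.xml`, the `omexManifest` and `content` element names, the namespace, and the format URIs below are assumptions that should be checked against the OMEX specification. The helper names `build_manifest`, `write_archive`, and `list_contents` and the sample files are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed identifiers (not stated in the abstract; verify against the OMEX spec).
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
COMBINE_BASE = "http://identifiers.org/combine.specifications/"
MEDIA_TYPE_BASE = "http://purl.org/NET/mediatypes/"


def build_manifest(entries):
    """Serialize (location, format) pairs into manifest XML bytes."""
    ET.register_namespace("", OMEX_NS)
    root = ET.Element(f"{{{OMEX_NS}}}omexManifest")
    for location, fmt in entries:
        ET.SubElement(
            root, f"{{{OMEX_NS}}}content", {"location": location, "format": fmt}
        )
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def write_archive(target, files):
    """Write a ZIP container with a manifest plus the given files.

    `files` maps an archive-relative name to a (format URI, bytes) pair.
    """
    # The archive itself and the manifest are listed alongside the payload.
    entries = [
        (".", COMBINE_BASE + "omex"),
        ("./manifest.xml", COMBINE_BASE + "omex-manifest"),
    ]
    entries += [("./" + name, fmt) for name, (fmt, _) in files.items()]
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", build_manifest(entries))
        for name, (_, data) in files.items():
            archive.writestr(name, data)


def list_contents(source):
    """Read the manifest back and return its (location, format) pairs."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    return [
        (item.get("location"), item.get("format"))
        for item in root.iter(f"{{{OMEX_NS}}}content")
    ]


# Hypothetical payload: one file in a COMBINE standard, one plain-text file.
sample_files = {
    "model.xml": (COMBINE_BASE + "sbml", b"<sbml/>"),
    "notes.txt": (MEDIA_TYPE_BASE + "text/plain", b"Free-text notes."),
}
buffer = io.BytesIO()
write_archive(buffer, sample_files)
contents = list_contents(buffer)
for location, fmt in contents:
    print(location, fmt)
```

The sketch writes to an in-memory buffer to stay self-contained; passing a file path such as `experiment.omex` to `write_archive` would produce a file on disk instead. The optional metadata file mentioned in the abstract is omitted here.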
