neo4jsbml: import systems biology markup language data into the graph database Neo4j

Systems Biology Markup Language (SBML) has emerged as a standard for representing biological models, facilitating model sharing and interoperability. It stores many types of data and complex relationships, which complicates data management and analysis. Traditional database management systems struggle to capture these complex networks of interactions within biological systems, whereas graph-oriented databases are well suited to managing interactions between entities. We present neo4jsbml, a Python library that bridges the gap between SBML data and the Neo4j graph database for storing, querying and analyzing data. SBML organizes biological entities in a hierarchical structure that reflects their interdependencies. This structure maps naturally onto a graph, which offers an efficient means of navigating and exploring a model’s components. With entities represented as nodes and their relationships as edges, Cypher, Neo4j’s query language, can efficiently traverse the resulting graph of complex biological networks. neo4jsbml imports SBML data into a Neo4j database according to a user-defined schema, so only the desired data is loaded. By leveraging Neo4j’s graph database technology, exploration of complex biological networks becomes intuitive and information retrieval efficient. neo4jsbml is user-friendly and can become a useful companion for visualizing and analyzing metabolic models through Neo4j. neo4jsbml is open-source software available at https://github.com/brsynth/neo4jsbml.
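To make the node-and-edge mapping concrete, the following is a minimal sketch using only the Python standard library. It does not use or reproduce the neo4jsbml API: the toy model, the function names `sbml_to_graph` and `to_cypher`, the node labels (`Compartment`, `Species`, `Reaction`) and the relationship types (`IN_COMPARTMENT`, `IS_REACTANT`, `HAS_PRODUCT`) are all assumptions chosen for illustration, standing in for whatever a user-defined schema would specify.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 2 Core documents.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version2/core"}

# Hypothetical one-reaction model, for illustration only.
TOY_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version2/core" level="3" version="2">
  <model id="toy">
    <listOfCompartments>
      <compartment id="c" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glc" compartment="c" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
      <species id="g6p" compartment="c" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="HEX1" reversible="false">
        <listOfReactants>
          <speciesReference species="glc" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="g6p" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def sbml_to_graph(sbml_text):
    """Return (nodes, edges) extracted from an SBML string.

    nodes: list of (label, id); edges: list of (source_id, type, target_id).
    Labels and relationship types are illustrative assumptions.
    """
    model = ET.fromstring(sbml_text).find("sbml:model", SBML_NS)
    nodes, edges = [], []
    for comp in model.iterfind("sbml:listOfCompartments/sbml:compartment", SBML_NS):
        nodes.append(("Compartment", comp.get("id")))
    for sp in model.iterfind("sbml:listOfSpecies/sbml:species", SBML_NS):
        nodes.append(("Species", sp.get("id")))
        edges.append((sp.get("id"), "IN_COMPARTMENT", sp.get("compartment")))
    for rx in model.iterfind("sbml:listOfReactions/sbml:reaction", SBML_NS):
        nodes.append(("Reaction", rx.get("id")))
        for ref in rx.iterfind("sbml:listOfReactants/sbml:speciesReference", SBML_NS):
            edges.append((ref.get("species"), "IS_REACTANT", rx.get("id")))
        for ref in rx.iterfind("sbml:listOfProducts/sbml:speciesReference", SBML_NS):
            edges.append((rx.get("id"), "HAS_PRODUCT", ref.get("species")))
    return nodes, edges


def to_cypher(nodes, edges):
    """Render the graph as Cypher MERGE statements (plain strings).

    Values are inlined only to keep the sketch readable; with a real
    database driver, query parameters should be used instead.
    """
    stmts = [f"MERGE (:{label} {{id: '{node_id}'}})" for label, node_id in nodes]
    for src, rel, dst in edges:
        stmts.append(
            f"MATCH (a {{id: '{src}'}}), (b {{id: '{dst}'}}) MERGE (a)-[:{rel}]->(b)"
        )
    return stmts


nodes, edges = sbml_to_graph(TOY_SBML)
statements = to_cypher(nodes, edges)
for stmt in statements:
    print(stmt)
```

Once such statements are loaded, a Cypher pattern along the lines of `MATCH (s:Species)-[:IS_REACTANT]->(r:Reaction)-[:HAS_PRODUCT]->(p:Species) RETURN s.id, r.id, p.id` would walk from substrates to products under the labels assumed here; the actual labels and properties depend on the schema the user supplies.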
