Medusa: Software to build and analyze ensembles of genome-scale metabolic network reconstructions

Uncertainty in the structure and parameters of networks is ubiquitous across computational biology. In constraint-based reconstruction and analysis of metabolic networks, this uncertainty arises both when networks are reconstructed and when simulations are performed with them. Here, we present Medusa, a Python package for the generation and analysis of ensembles of genome-scale metabolic network reconstructions. Medusa builds on the COBRApy package for constraint-based reconstruction and analysis by compressing a set of models into a compact ensemble object, providing functions for generating ensembles from experimental data, and extending constraint-based analyses to the ensemble scale. We demonstrate how Medusa can be used to generate ensembles and perform ensemble simulations, and how machine learning can be used in conjunction with Medusa to guide the curation of genome-scale metabolic network reconstructions. Medusa is available under the permissive MIT license from the Python Package Index (https://pypi.org) and from GitHub (https://github.com/opencobra/Medusa), and comprehensive documentation is available at https://medusa.readthedocs.io/en/latest.
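
The idea of compressing a set of models into a compact ensemble object can be illustrated with a minimal sketch. The code below is not Medusa's implementation and does not use the Medusa or COBRApy APIs; the `ToyEnsemble` class, the `compress` and `expand` helpers, and the three toy models are hypothetical and exist only for illustration. It assumes each model is a mapping from reaction identifier to flux bounds, and that a reaction missing from a model is equivalent to one whose bounds are closed at zero. Reactions that are identical in every member are stored once, and only the reactions that differ are stored per member.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnsemble:
    """Hypothetical container: shared content once, differences per member."""
    members: list = field(default_factory=list)
    base: dict = field(default_factory=dict)      # reaction id -> bounds shared by all members
    features: dict = field(default_factory=dict)  # reaction id -> {member name: bounds}


def compress(models):
    """Split a dict of toy models into shared reactions and variable features."""
    ensemble = ToyEnsemble(members=list(models))
    all_reactions = set().union(*(m.keys() for m in models.values()))
    absent = (0.0, 0.0)  # assumption: a missing reaction carries no flux
    for rxn in sorted(all_reactions):
        states = {name: m.get(rxn, absent) for name, m in models.items()}
        if len(set(states.values())) == 1:
            # Identical in every member: store a single copy.
            ensemble.base[rxn] = next(iter(states.values()))
        else:
            # Differs between members: store one state per member.
            ensemble.features[rxn] = states
    return ensemble


def expand(ensemble, name):
    """Rebuild one member from the shared base plus its feature states."""
    model = dict(ensemble.base)
    for rxn, states in ensemble.features.items():
        model[rxn] = states[name]
    return model


# Toy data: reaction id -> (lower bound, upper bound).
models = {
    "m1": {"PGI": (-1000.0, 1000.0), "PFK": (0.0, 1000.0), "EX_glc": (-10.0, 1000.0)},
    "m2": {"PGI": (-1000.0, 1000.0), "PFK": (0.0, 1000.0)},
    "m3": {"PGI": (-1000.0, 1000.0), "PFK": (-1000.0, 1000.0), "EX_glc": (-10.0, 1000.0)},
}

ensemble = compress(models)
member_m2 = expand(ensemble, "m2")
```

In this sketch, `PGI` lands in `ensemble.base` because all three toy models agree on it, while `PFK` and `EX_glc` become features. Expanding `m2` restores `EX_glc` with closed bounds rather than omitting it, which is equivalent under the stated assumption.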
