Workflow for Data Analysis in Experimental and Computational Systems Biology: Using Python as ‘Glue’

Bottom-up systems biology entails the construction of kinetic models of cellular pathways by collecting kinetic information on the pathway components (e.g., enzymes) and collating this into a kinetic model, based for example on ordinary differential equations. This requires integration and data transfer between a variety of tools, ranging from data acquisition in kinetics experiments, to fitting and parameter estimation, to model construction, evaluation and validation. Here, we present a workflow that uses the Python programming language, specifically the modules from the SciPy stack, to facilitate this task. Starting from raw kinetics data, acquired either from spectrophotometric assays with microtitre plates or from Nuclear Magnetic Resonance (NMR) spectroscopy time-courses, we demonstrate the fitting and construction of a kinetic model using scientific Python tools. The analysis takes place in a Jupyter notebook, which keeps all information related to a particular experiment together in one place and thus serves as an e-labbook, enhancing reproducibility and traceability. The Python programming language serves as an ideal foundation for this framework because it is powerful yet relatively easy to learn for the non-programmer, has a large library of scientific routines and an active user community, and is open-source and extensible. In addition, many computational systems biology software tools are written in Python or have a Python Application Programming Interface (API). Our workflow thus enables investigators to focus on the scientific problem at hand rather than worrying about data integration between disparate platforms.
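
The two central steps named above, parameter estimation from kinetics data and construction of an ordinary differential equation model, can be illustrated with a minimal sketch using the SciPy stack. This is not the workflow's actual code: the data are synthetic, and the rate law (irreversible Michaelis-Menten), the parameter values, the function names `michaelis_menten` and `model`, and the single-reaction system are all assumptions chosen purely for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import curve_fit


def michaelis_menten(s, vmax, km):
    """Irreversible Michaelis-Menten rate law (assumed for illustration)."""
    return vmax * s / (km + s)


# Synthetic initial-rate data standing in for a plate-reader assay.
# The "true" parameters and the 2 % noise level are hypothetical.
substrate = np.array([0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0])  # mM
true_vmax, true_km = 2.0, 0.8
rng = np.random.default_rng(0)
rates = michaelis_menten(substrate, true_vmax, true_km)
rates = rates * (1.0 + 0.02 * rng.standard_normal(substrate.size))

# Step 1: estimate the kinetic parameters by non-linear least squares.
popt, pcov = curve_fit(michaelis_menten, substrate, rates, p0=[1.0, 1.0])
vmax_fit, km_fit = popt
stderr = np.sqrt(np.diag(pcov))  # approximate standard errors


# Step 2: use the fitted parameters in an ODE model of S -> P.
def model(t, y):
    s, p = y
    v = michaelis_menten(s, vmax_fit, km_fit)
    return [-v, v]


t_eval = np.linspace(0.0, 10.0, 51)
sol = solve_ivp(model, (0.0, 10.0), [5.0, 0.0], t_eval=t_eval)

print(f"Vmax = {vmax_fit:.3f} +/- {stderr[0]:.3f}")
print(f"Km   = {km_fit:.3f} +/- {stderr[1]:.3f}")
```

In a notebook, the fitted curve and the simulated time-course in `sol.t` and `sol.y` would typically be plotted next to the raw data, so that the data, the fit and the resulting model stay together in one document.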
