From DNA to FBA: How to Build Your Own Genome-Scale Metabolic Model

Microbiological studies increasingly rely on in silico methods for rapid exploration and analysis of genomic data, and genome-scale metabolic models offer functional genomics studies a new perspective. A mathematical model of a microbe’s entire metabolic map can be built rapidly by sequencing its whole genome and annotating the functions encoded in its DNA. Flux-balance analysis (FBA), a linear programming technique that uses metabolic models to predict phenotypic responses to environmental conditions, is the leading method to simulate and manipulate cellular growth in silico. However, creating a model accurate enough for FBA takes a series of steps that link bioinformatics databases, enzyme resources, and metabolic pathways. We present the methodology and procedure for obtaining a metabolic model with PyFBA, an extensible, open-source Python software package that provides a platform for building metabolic models from functional annotations (http://linsalrob.github.io/PyFBA). Backed by the Model SEED biochemistry database, PyFBA contains methods to reconstruct a microbe’s metabolic map, run FBA under different media conditions, and gap-fill its metabolism. The extensibility of PyFBA facilitates novel techniques for creating accurate genome-scale metabolic models.
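
For readers unfamiliar with FBA, the linear program it solves can be sketched in its standard textbook form. The symbols below are illustrative assumptions and do not come from the abstract: S is the stoichiometric matrix (m compounds by n reactions), v the vector of reaction fluxes, c a vector that selects the objective (typically the biomass reaction), and l_j, u_j the lower and upper bounds on each flux.

```latex
\begin{aligned}
\max_{v \in \mathbb{R}^{n}} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& l_j \le v_j \le u_j, \qquad j = 1, \dots, n .
\end{aligned}
```

The equality constraint states the steady-state assumption: every compound is produced as fast as it is consumed. In this form, media conditions typically enter through the bounds on uptake reactions, and a model is usually read as predicting growth when the optimal objective value is positive. This is a generic formulation, not necessarily the exact one PyFBA implements.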
