Toward the automated generation of genome-scale metabolic networks in the SEED

Background: Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process.

Results: We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis.

Conclusion: Our method sets the stage for the automated generation of substantially complete metabolic networks for over 400 complete genome sequences currently in the SEED. With each genome that is processed using our tools, the database of common components grows to cover more of the diversity of metabolic pathways. This increases the likelihood that components of reaction networks for subsequently processed genomes can be retrieved from the database, rather than assembled and verified manually.
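
The two-level verification described in the Results (each component checked in isolation, then the assembled components checked for interconnectivity) can be illustrated with a minimal Python sketch. It is not the SEED implementation: the abstract does not specify how verification is carried out, so the sketch substitutes a simple qualitative reachability test. The names `Reaction`, `Component`, `producible`, `verify_component`, and `unmet_inputs`, as well as the placeholder compounds `A`, `B`, `C`, `D`, and `X`, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    substrates: frozenset  # compounds consumed
    products: frozenset    # compounds produced


@dataclass(frozen=True)
class Component:
    name: str
    inputs: frozenset      # compounds the component expects from outside
    outputs: frozenset     # compounds the component should deliver
    reactions: tuple       # Reaction objects belonging to the component


def producible(seed, reactions):
    """Return all compounds reachable from `seed` by repeatedly firing
    reactions whose substrates are all available (assumed check)."""
    available = set(seed)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.substrates <= available and not r.products <= available:
                available |= r.products
                changed = True
    return available


def verify_component(component):
    """Isolation check: can the declared outputs be made from the inputs?"""
    return component.outputs <= producible(component.inputs, component.reactions)


def unmet_inputs(components, medium):
    """Integration check: list, per component, the inputs that neither the
    medium nor any selected component's outputs supply."""
    supplied = set(medium)
    for c in components:
        supplied |= c.outputs
    gaps = {}
    for c in components:
        missing = c.inputs - supplied
        if missing:
            gaps[c.name] = sorted(missing)
    return gaps


# Toy example with placeholder compounds (no real biochemistry implied).
upper = Component(
    name="upper",
    inputs=frozenset({"A"}),
    outputs=frozenset({"C"}),
    reactions=(
        Reaction(frozenset({"A"}), frozenset({"B"})),
        Reaction(frozenset({"B"}), frozenset({"C"})),
    ),
)
lower = Component(
    name="lower",
    inputs=frozenset({"C", "X"}),
    outputs=frozenset({"D"}),
    reactions=(Reaction(frozenset({"C", "X"}), frozenset({"D"})),),
)

isolation_ok = [verify_component(c) for c in (upper, lower)]
gaps_without_x = unmet_inputs([upper, lower], medium={"A"})
gaps_with_x = unmet_inputs([upper, lower], medium={"A", "X"})
```

In this toy case both components pass the isolation check, while the integration check reports that `lower` still needs `X` unless the medium provides it. A gap of that kind is what would direct manual effort toward the part of metabolism not yet covered by the component database.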
