Pantograph: A template-based method for genome-scale metabolic model reconstruction

Genome-scale metabolic models are a powerful tool for studying the inner workings of biological systems and for guiding applications. Cheap sequencing has made it possible to create metabolic maps of biotechnologically interesting organisms. While this drives the development of new methods and automatic tools, network reconstruction remains a time-consuming process that requires extensive manual curation. This curation introduces specific knowledge about the modeled organism, either explicitly in the form of molecular processes or indirectly in the form of annotations of the model elements. Paradoxically, this knowledge is usually lost when the reconstruction of a different organism is started. We introduce the Pantograph method for metabolic model reconstruction. The method combines a template reaction knowledge base, orthology mappings between two organisms, and experimental phenotypic evidence to build a genome-scale metabolic model for a target organism. It infers implicit knowledge from annotations in the template and rewrites these inferences to include them in the resulting model of the target organism. The generated model is well suited for manual curation. Scripts for evaluating the model against experimental data are generated automatically to aid curators in iterative improvement. We present an implementation of the Pantograph method as a toolbox for genome-scale model reconstruction, curation, and validation. This open source package can be obtained from http://pathtastic.gforge.inria.fr.
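The core idea of projecting a template onto a target organism through orthology mappings can be illustrated with a minimal sketch. It is not the Pantograph implementation. It assumes, purely for illustration, that each template reaction carries a gene association in disjunctive normal form (a list of alternative enzyme complexes, each a set of genes) and that orthology is given as a plain dictionary. The function names, reaction identifiers, and gene identifiers are all hypothetical, and the rule used here (a complex is kept only if every subunit has an ortholog) is one plausible rewriting policy, not necessarily the one used by the method.

```python
from itertools import product


def rewrite_association(complexes, orthologs):
    """Rewrite a template gene association into target-organism genes.

    complexes: list of sets of template gene ids, read as an OR of ANDs.
    orthologs: dict mapping a template gene id to a set of target gene ids.
    Returns a set of frozensets, each one a candidate target complex.
    """
    rewritten = set()
    for complex_genes in complexes:
        # One list of candidate target genes per subunit of the complex.
        choices = [sorted(orthologs.get(g, ())) for g in sorted(complex_genes)]
        # Assumed policy: drop the complex if it is empty or if any
        # subunit has no ortholog in the target organism.
        if not choices or not all(choices):
            continue
        # Every combination of orthologs gives one candidate complex.
        for combo in product(*choices):
            rewritten.add(frozenset(combo))
    return rewritten


def project_model(template, orthologs):
    """Keep each template reaction that has at least one rewritten complex."""
    target = {}
    for reaction_id, complexes in template.items():
        new_association = rewrite_association(complexes, orthologs)
        if new_association:
            target[reaction_id] = new_association
    return target


# Hypothetical template: reaction id -> alternative enzyme complexes.
template = {
    "R_ISO": [{"tA"}, {"tB"}],    # two isozymes
    "R_CPLX": [{"tC", "tD"}],     # one two-subunit complex
    "R_LOST": [{"tE"}],           # no ortholog in the target
}
# Hypothetical orthology mapping: template gene -> target genes.
orthologs = {"tA": {"yA"}, "tC": {"yC1", "yC2"}, "tD": {"yD"}}

target_model = project_model(template, orthologs)
```

In this sketch a reaction with no supporting ortholog is simply dropped. A real reconstruction would more likely flag such reactions for manual curation or gap filling against phenotypic evidence rather than discard them silently.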
