GrowMatch: An Automated Method for Reconciling In Silico/In Vivo Growth Predictions

Genome-scale metabolic reconstructions are typically validated by comparing in silico growth predictions, made for different mutants grown on different carbon sources, with in vivo growth data. This comparison yields two types of model-prediction inconsistency: either the model predicts growth when no growth is observed in the experiment (GNG inconsistencies), or the model predicts no growth when the experiment reveals growth (NGG inconsistencies). Here we propose an optimization-based framework, GrowMatch, to automatically reconcile GNG predictions (by suppressing functionalities in the model) and NGG predictions (by adding functionalities to the model). We used GrowMatch to resolve inconsistencies between the predictions of the latest in silico Escherichia coli model (iAF1260) and the in vivo data available in the Keio collection, improving the consistency of in silico predictions with in vivo data from 90.6% to 96.7%. Specifically, we were able to suggest consistency-restoring hypotheses for 56/72 GNG mutants and 13/38 NGG mutants. Of the 56 GNG mutants, GrowMatch resolved 18 by suggesting suppressions in the mutant metabolic networks; 15 were resolved by suppressing isozymes in the metabolic network, and the remaining 23, which correspond to blocked genes, were resolved by suitably modifying the biomass equation of iAF1260. Of the 13 NGG mutants, GrowMatch suggested consistency-restoring hypotheses for five by adding functionalities to the model, whereas the remaining eight were resolved by pinpointing possible alternate genes that carry out the function of the deleted gene. In many cases, GrowMatch identified fairly nonintuitive model modification hypotheses that would have been difficult to pinpoint through inspection alone. In addition, GrowMatch can be used during the construction phase of new, as opposed to existing, genome-scale metabolic models, leading to more expedient and accurate reconstructions.
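The comparison described above can be illustrated with a minimal sketch. It is not the GrowMatch optimization itself, only the bookkeeping that sorts each mutant into a consistent or inconsistent category and reports the fraction of consistent predictions. The function names, the labels "GG" and "NGNG" for the two consistent cases, and the toy data are assumptions made for illustration; they do not come from the paper.

```python
from collections import Counter


def classify(predicted_growth: bool, observed_growth: bool) -> str:
    """Label one mutant by comparing in silico and in vivo growth.

    "GNG" and "NGG" follow the abstract's definitions; "GG" and "NGNG"
    are hypothetical labels for the two consistent cases.
    """
    if predicted_growth and observed_growth:
        return "GG"    # model and experiment both show growth
    if predicted_growth and not observed_growth:
        return "GNG"   # model predicts growth, experiment shows none
    if not predicted_growth and observed_growth:
        return "NGG"   # model predicts no growth, experiment shows growth
    return "NGNG"      # model and experiment both show no growth


def consistency(pairs):
    """Return per-category counts and the fraction of consistent cases."""
    counts = Counter(classify(pred, obs) for pred, obs in pairs)
    total = sum(counts.values())
    agree = counts["GG"] + counts["NGNG"]
    return counts, (agree / total if total else 0.0)


# Hypothetical toy data: (predicted growth, observed growth) per mutant.
toy = [(True, True), (True, False), (False, True), (False, False), (True, True)]
counts, fraction = consistency(toy)
print(dict(counts), fraction)
```

In this framing, reconciling a GNG mutant means changing the model so that its prediction flips to no growth, and reconciling an NGG mutant means changing it so that the prediction flips to growth; either change moves the mutant into one of the two consistent categories.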