Frontiers in metabolic reconstruction and modeling of plant genomes.

A major goal of post-genomic biology is to reconstruct and model in silico the metabolic networks of entire organisms. Work on bacteria is well advanced, and is now under way for plants and other eukaryotes. Genome-scale modelling in plants is much more challenging than in bacteria. The challenges come from features characteristic of higher organisms (subcellular compartmentation, tissue differentiation) and also from the particular severity in plants of a general problem: genome content whose functions remain undiscovered. This problem results in thousands of genes for which no function is known ('undiscovered genome content') and hundreds of enzymatic and transport functions for which no gene is yet identified. The severity of the undiscovered genome content problem in plants reflects their genome size and complexity. To bring the challenges of plant genome-scale modelling into focus, we first summarize the current status of plant genome-scale models. We then highlight the challenges - and ways to address them - in three areas: identifying genes for missing processes, modelling tissues as opposed to single cells, and finding metabolic functions encoded by undiscovered genome content. We also discuss the emerging view that a significant fraction of undiscovered genome content encodes functions that counter damage to metabolites inflicted by spontaneous chemical reactions or enzymatic mistakes.
