Chemoinformatics on Metabolic Pathways: Attaching Biochemical Information on Putative Enzymatic Reactions

Chemical genomics, a cutting-edge research area of the post-genomic era, requires the sophisticated integration of heterogeneous information, namely genomic and chemical information. Enzymes play key roles in the dynamic behavior of living organisms, linking information in the chemical space and the genomic space. In this chapter, the authors report their recent efforts in this area, including the development of a similarity measure between two chemical compounds, a system that predicts a plausible enzyme for a given substrate and product pair, and two different approaches to predicting the fate of a given compound in a metabolic pathway. General problems and possible future directions are also discussed, in the hope of attracting more researchers to this area. DOI: 10.4018/978-1-4666-3604-0.ch053
