A Fast and Accurate Algorithm for Comparative Analysis of Metabolic Pathways

Pathways show how different biochemical entities interact with one another to perform functions vital to the survival of an organism. Comparative analysis of pathways is crucial for identifying functional similarities that are difficult to detect by comparing the individual entities that make up these pathways. When the interacting entities are of a single type, the problem of identifying similarities by aligning the pathways can be reduced to the graph isomorphism problem. For pathways with several types of entities, such as metabolic pathways, the alignment problem is even more challenging. To simplify this problem, existing methods often reduce metabolic pathways to graphs with restricted topologies and a single type of node. However, these abstractions discard information and therefore significantly reduce the relevance of the alignment. In this paper, we describe an algorithm that solves the pairwise alignment problem for metabolic pathways. A distinguishing feature of our method is that it aligns different types of entities, such as enzymes, reactions and compounds. Our approach also models the pathways without any abstraction. We pursue the intuition that both the pairwise similarities of entities (homology) and the organization of their interactions (topology) are important for metabolic pathway alignment. Our algorithm accounts for both by creating an eigenvalue problem for each entity type. We enforce consistency when combining the alignments of different entity types by considering the reachability sets of entities. Our experiments show that our method finds biologically and statistically significant alignments on the order of milliseconds.

AVAILABILITY Our software and its source code in the C programming language are available at http://bioinformatics.cise.ufl.edu/pal.html.
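
The idea of combining homology and topology through an eigenvalue problem can be illustrated with a small power-iteration sketch for a single entity type. This is not the authors' implementation (which is written in C): the update rule, the weight `alpha`, the function name `align_scores`, and the use of undirected neighbor lists are all assumptions made for illustration. Each candidate pair (i, j) receives support from the pairs of its neighbors, blended with its own homology score, and the iteration is repeated until the scores stabilize.

```python
from itertools import product


def align_scores(nbrs_p, nbrs_q, homology, alpha=0.7, iters=200, tol=1e-12):
    """Hypothetical sketch: score every pair (i, j) of entities from two pathways.

    nbrs_p, nbrs_q: undirected (symmetric) neighbor lists, one per entity.
    homology[i][j]: non-negative pairwise similarity with a positive total.
    alpha: assumed weight of topology relative to homology.
    """
    pairs = list(product(range(len(nbrs_p)), range(len(nbrs_q))))
    total = sum(homology[i][j] for i, j in pairs)
    # Homology vector, normalized to sum to 1.
    h = {(i, j): homology[i][j] / total for i, j in pairs}
    s = dict(h)
    for _ in range(iters):
        new = {}
        for i, j in pairs:
            # Topological support: each neighbor pair (u, v) spreads its
            # score evenly over all pairs formed by its own neighbors.
            support = 0.0
            for u in nbrs_p[i]:
                for v in nbrs_q[j]:
                    support += s[(u, v)] / (len(nbrs_p[u]) * len(nbrs_q[v]))
            new[(i, j)] = alpha * support + (1 - alpha) * h[(i, j)]
        norm = sum(new.values())
        new = {k: val / norm for k, val in new.items()}
        delta = max(abs(new[k] - s[k]) for k in pairs)
        s = new
        if delta < tol:
            break
    return s


# Toy input: two three-entity chains (0 - 1 - 2) with uniform homology.
chain = [[1], [0, 2], [1]]
uniform = [[1.0] * 3 for _ in range(3)]
scores = align_scores(chain, chain, uniform)
best_pair = max(scores, key=scores.get)
print(best_pair, round(scores[best_pair], 4))
```

With uniform homology the ranking is driven by topology alone, so the pair of middle entities scores highest in this toy input. Turning the scores into a one-to-one mapping, and reconciling the mappings of different entity types through reachability sets, are separate steps that this sketch does not attempt.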
