Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks

Background
Inferring the tree of life requires two things: characteristics shared by species descended from a common ancestor, which serve as the measuring criteria, and a method for calculating the distance between the values obtained for each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities and differences between metabolic networks is computationally challenging, and systematic methods for doing so are desirable. Here we introduce a graph-kernel method that computes the similarity between metabolic networks in polynomial time, and we use it to profile metabolic pathways and to construct phylogenetic trees.

Results
To compare the structures of metabolic networks across organisms, we adopted the exponential graph kernel, a kernel-based approach that operates on a labeled graph described by a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used the unweighted pair-group method with arithmetic mean (UPGMA), a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two-domain scheme of prokaryotes and eukaryotes.

Conclusion
By combining kernel machines with metabolic information, the method infers the context of biosphere development, covering the physiological events required for adaptation by genetic reconstruction. The results show that a global view of the tree of life can be obtained by comparing metabolic pathway structures using meta-level information rather than sequence information. Studying the detailed structure of the phylogenetic trees constructed by the kernel-based method may yield further information about biological evolution, such as the history of horizontal transfer of each gene.
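
The pipeline described in the Results (a graph kernel computed from a label matrix and an adjacency matrix, followed by UPGMA clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the kernel formula, so the feature map L^T exp(beta * A) L, the cosine normalisation, the distance sqrt(2 - 2k), the value beta = 0.5, and the three toy networks are all assumptions made for illustration. The names `feature_map`, `kernel_matrix`, `kernel_distance`, and `upgma` are hypothetical helpers.

```python
import numpy as np

def exp_adjacency(adj, beta=0.5):
    """exp(beta * A) for a symmetric adjacency matrix, via eigendecomposition."""
    w, v = np.linalg.eigh(adj)            # A = V diag(w) V^T
    return (v * np.exp(beta * w)) @ v.T

def feature_map(labels, adj, beta=0.5):
    """Assumed feature map: L^T exp(beta * A) L over a shared label vocabulary."""
    return labels.T @ exp_adjacency(adj, beta) @ labels

def kernel_matrix(graphs, beta=0.5):
    """Cosine-normalised Gram matrix over (label_matrix, adjacency_matrix) pairs."""
    feats = [feature_map(L, A, beta).ravel() for L, A in graphs]
    gram = np.array([[f @ g for g in feats] for f in feats])
    norm = np.sqrt(np.diag(gram))
    return gram / np.outer(norm, norm)

def kernel_distance(k):
    """Distance induced by a normalised kernel: d = sqrt(2 - 2k)."""
    return np.sqrt(np.clip(2.0 - 2.0 * k, 0.0, None))

def upgma(dist, names):
    """UPGMA: merge the closest pair, averaging distances weighted by cluster size."""
    clusters = {i: (names[i], 1) for i in range(len(names))}   # id -> (tree, size)
    d = {(i, j): float(dist[i, j]) for i in clusters for j in clusters if i < j}
    next_id = len(names)
    while len(clusters) > 1:
        (a, b), _ = min(d.items(), key=lambda kv: kv[1])
        (ta, na), (tb, nb) = clusters.pop(a), clusters.pop(b)
        for c in clusters:
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        del d[(a, b)]
        clusters[next_id] = ((ta, tb), na + nb)
        next_id += 1
    return next(iter(clusters.values()))[0]

def path_graph(node_labels, n_labels):
    """Toy network: nodes joined in a chain, one-hot labels from a shared vocabulary."""
    n = len(node_labels)
    adj = np.zeros((n, n))
    for i in range(n - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    return np.eye(n_labels)[node_labels], adj

# Hypothetical data: three networks over a vocabulary of four node labels.
names = ["net_a", "net_b", "net_c"]
graphs = [path_graph([0, 1, 2], 4), path_graph([0, 1, 2, 3], 4), path_graph([2, 3], 4)]

k = kernel_matrix(graphs)
tree = upgma(kernel_distance(k), names)
print(tree)
```

The tree is returned as nested tuples, so networks that share more labeled structure end up in the same inner tuple. In a real analysis the label vocabulary would come from a pathway database, and `beta` would need to be chosen with care because it controls how strongly longer paths contribute to the similarity.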
