iTree: A high-throughput phylogenomic pipeline

Phylogenomics, conventionally defined as the intersection of phylogenetics and genomics, has become a key instrument in a wide spectrum of biological studies, including resolving complex evolutionary relationships, assigning taxonomic affiliation, predicting protein molecular function, and tracing horizontal gene transfer events. Here, we introduce iTree, an open-source phylogenomic pipeline that automates the execution of phylogenetic analyses in multithreaded and grid-computing environments, providing a scalable, high-throughput platform for genome-wide evolutionary analyses. We also describe the results of two applications of iTree: (1) taxonomic assignment of 16S ribosomal RNA sequences from human oral metagenomic samples and (2) detection of horizontal gene transfer in microbial genomes.
