XplorSeq: A software environment for integrated management and phylogenetic analysis of metagenomic sequence data

Background: Advances in automated DNA sequencing technology have accelerated the generation of metagenomic DNA sequences, especially environmental ribosomal RNA gene (rDNA) sequences. As the scale of rDNA-based studies of microbial ecology has expanded, a need has arisen for software capable of managing, annotating, and analyzing the diverse data accumulated in these projects.

Results: XplorSeq is a software package that facilitates the compilation, management, and phylogenetic analysis of DNA sequences. XplorSeq was developed for, but is not limited to, high-throughput analysis of environmental rRNA gene sequences. XplorSeq integrates and extends several commonly used UNIX-based analysis tools through a Mac OS X-based graphical user interface (GUI). Through this GUI, users may perform basic sequence import and assembly steps (base-calling, vector/primer trimming, contig assembly), perform BLAST (Basic Local Alignment Search Tool; [1–3]) searches of NCBI and local databases, create multiple sequence alignments, build phylogenetic trees, assemble Operational Taxonomic Units (OTUs), estimate biodiversity indices, and summarize data in a variety of formats. Furthermore, sequences may be annotated with user-specified metadata, which can then be used to sort data and to organize analyses and reports. A document-based architecture permits parallel analysis of sequence data from multiple clones or amplicons, with sequences and other data stored in a single file.

Conclusion: XplorSeq should benefit researchers who are engaged in analyses of environmental sequence data, especially those with little experience using bioinformatics software. Although XplorSeq was developed for management of rDNA sequence data, it can be applied to almost any sequencing project. The application is available free of charge for non-commercial use at http://vent.colorado.edu/phyloware.
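The abstract mentions estimating biodiversity indices from Operational Taxonomic Units but does not say which indices XplorSeq computes or how. As a minimal sketch of what such an estimate can look like, the Python below computes two widely used measures, the Shannon index and the bias-corrected Chao1 richness estimator, from a list of per-OTU sequence counts. The function names and the example counts are hypothetical and are not taken from XplorSeq.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(otu_counts)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in otu_counts if n > 0)


def chao1(otu_counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 is the number of OTUs seen exactly once (singletons) and
    F2 the number seen exactly twice (doubletons).
    """
    observed = sum(1 for n in otu_counts if n > 0)
    singletons = sum(1 for n in otu_counts if n == 1)
    doubletons = sum(1 for n in otu_counts if n == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


if __name__ == "__main__":
    # Hypothetical clone-library counts: number of sequences assigned to each OTU.
    counts = [10, 5, 1, 1, 2, 2]
    print("Shannon index:", shannon_index(counts))
    print("Chao1 estimate:", chao1(counts))
```

The bias-corrected form of Chao1 is used here because it remains defined when no doubletons are observed, which is common in small clone libraries.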

[1] N. Pace, et al. Microbial Community Biofabrics in a Geothermal Mine Adit, 2007, Applied and Environmental Microbiology.

[2] N. Pace, et al. Community and cultivation analysis of arsenite oxidizing biofilms at Hot Creek, 2006, Environmental microbiology.

[3] S. Kelley, et al. Molecular survey of aeroplane bacterial contamination, 2005, Journal of applied microbiology.

[4] N. Pace, et al. Hydrogen and bioenergetics in the Yellowstone geothermal ecosystem, 2005, Proceedings of the National Academy of Sciences of the United States of America.

[5] Jo Handelsman, et al. Miniprimer PCR, a New Lens for Viewing the Microbial World, 2007, Applied and Environmental Microbiology.

[6] E. Mardis, et al. An obesity-associated gut microbiome with increased capacity for energy harvest, 2006, Nature.

[7] N. Pace, et al. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases, 2007, Proceedings of the National Academy of Sciences.

[8] N. Pace, et al. Microbial Ecology and Energetics in Yellowstone Hot Springs, 2006.

[9] J. Handelsman, et al. The last word: books as a statistical metaphor for microbial communities, 2007, Annual review of microbiology.

[10] E. Myers, et al. Basic local alignment search tool, 1990, Journal of molecular biology.

[11] K. Schleifer, et al. ARB: a software environment for sequence data, 2004, Nucleic acids research.

[12] N. Pace, et al. Composition and Structure of Microbial Communities from Stromatolites of Hamelin Pool in Shark Bay, Western Australia, 2005, Applied and Environmental Microbiology.

[13] James A. Foster, et al. Clearcut: a fast implementation of relaxed neighbor joining, 2006.

[14] Thomas Ludwig, et al. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees, 2005, Bioinform.

[15] Scott T Kelley, et al. Culture-independent analysis of bacterial diversity in a child-care facility, 2007, BMC Microbiology.

[16] W. Ludwig, et al. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB, 2007, Nucleic acids research.

[17] T. Kieft, et al. Subsurface Microbial Diversity in Deep-Granitic-Fracture Water in Colorado, 2007, Applied and Environmental Microbiology.

[18] Philip Hugenholtz, et al. NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes, 2006, Nucleic Acids Res.

[19] R. Knight, et al. Evolution of Mammals and Their Gut Microbes, 2008, Science.

[20] N. Pace, et al. Metagenomic approaches for defining the pathogenesis of inflammatory bowel diseases, 2008, Cell host & microbe.

[21] N. Pace, et al. Phylogenetic Composition of Rocky Mountain Endolithic Microbial Ecosystems, 2007, Applied and Environmental Microbiology.

[22] J. Handelsman, et al. Introducing DOTUR, a Computer Program for Defining Operational Taxonomic Units and Estimating Species Richness, 2005, Applied and Environmental Microbiology.

[23] N. Pace, et al. Culture-Independent Analysis of Indomethacin-Induced Alterations in the Rat Gastrointestinal Microbiota, 2006, Applied and Environmental Microbiology.

[24] Leah M. Feazel, et al. Eucaryotic Diversity in a Hypersaline Microbial Mat, 2007, Applied and Environmental Microbiology.

[25] Rodrigo Lopez, et al. Multiple sequence alignment with the Clustal series of programs, 2003, Nucleic Acids Res.

[26] N. Pace, et al. Microbial diversity in chronic open wounds, 2009, Wound repair and regeneration: official publication of the Wound Healing Society [and] the European Tissue Repair Society.

[27] R. Knight, et al. UniFrac: a New Phylogenetic Method for Comparing Microbial Communities, 2005, Applied and Environmental Microbiology.

[28] P. Green, et al. Base-calling of automated sequencer traces using phred. II. Error probabilities, 1998, Genome research.

[29] A. Magurran, et al. Measuring Biological Diversity, 2004.

[30] Jeffrey I. Gordon, et al. Reciprocal Gut Microbiota Transplants from Zebrafish and Mice to Germ-free Recipients Reveal Host Habitat Selection, 2006, Cell.

[31] N. Pace, et al. Molecular identification of bacteria in bronchoalveolar lavage fluid from children with cystic fibrosis, 2007, Proceedings of the National Academy of Sciences.

[32] J. Thompson, et al. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, 1994, Nucleic acids research.

[33] R. Reid, et al. Sulfate reducing bacteria in microbial mats: Changing paradigms, new discoveries, 2006.

[34] Gregory D. Schuler, et al. Database resources of the National Center for Biotechnology Information, 2021, Nucleic Acids Res.

[35] N. Pace, et al. Gastrointestinal microbiology enters the metagenomics era, 2008, Current opinion in gastroenterology.

[36] P. Green, et al. Base-calling of automated sequencer traces using phred. I. Accuracy assessment, 1998, Genome research.

[37] A. Solow, et al. Measuring biological diversity, 2006, Environmental and Ecological Statistics.

[38] N. Pace, et al. Geobiology of a microbial endolithic community in the Yellowstone geothermal environment, 2005, Nature.

[39] L. Fulton, et al. Diet-induced obesity is linked to marked but reversible alterations in the mouse distal gut microbiome, 2008, Cell host & microbe.

[40] F. Bäckhed, et al. Obesity alters gut microbial ecology, 2005, Proceedings of the National Academy of Sciences of the United States of America.

[41] Scott R. Miller, et al. Unexpected Diversity and Complexity of the Guerrero Negro Hypersaline Microbial Mat, 2006, Applied and Environmental Microbiology.

[42] J. Thompson, et al. Using CLUSTAL for multiple sequence alignments, 1996, Methods in enzymology.

[43] N. Pace, et al. Hydrogen and Primary Productivity: Inference of Biogeochemistry from Phylogeny in a Geothermal Ecosystem, 2006.