Ribosomal Database Project: data and tools for high throughput rRNA analysis

The Ribosomal Database Project (RDP; http://rdp.cme.msu.edu/) provides the research community with aligned and annotated rRNA gene sequence data, along with tools that allow researchers to analyze their own rRNA gene sequences in the RDP framework. RDP data and tools are used in fields as diverse as human health, microbial ecology, environmental microbiology, nucleic acid chemistry, taxonomy and phylogenetics. In addition to aligned and annotated collections of bacterial and archaeal small subunit rRNA genes, RDP now includes a collection of fungal large subunit rRNA genes. RDP tools, including Classifier and Aligner, have been updated to work with this new fungal collection. The use of high-throughput sequencing to characterize environmental microbial populations has exploded in the past several years, and as sequencing technologies have improved, the sizes of environmental datasets have increased. With release 11, RDP provides an expanded set of tools to facilitate analysis of high-throughput data, including both single-stranded and paired-end reads. In addition, most tools are now available as open source packages for download and local use by researchers who have high-volume needs or who would like to develop custom analysis pipelines.
