The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy

The interrogation of genetic markers in environmental meta-barcoding studies is currently seriously hindered by the lack of taxonomically curated reference data sets for the targeted genes. The Protist Ribosomal Reference database (PR2, http://ssu-rrna.org/) provides unique access to eukaryotic small sub-unit (SSU) ribosomal RNA and DNA sequences with curated taxonomy. The database consists mainly of nuclear-encoded protistan sequences. However, metazoans, land plants, macroscopic fungi and eukaryotic organelles (mitochondrion, plastid and others) are also included because they are useful for the analysis of high-throughput sequencing data sets. Introns and putative chimeric sequences have also been carefully checked. The taxonomic assignment of each sequence consists of eight unique taxonomic fields. In total, 136 866 sequences are nuclear encoded and 45 708 (36 501 mitochondrial and 9657 chloroplastic) are from organelles, the remainder being putative chimeric sequences. The website allows users to download sequences from the entire database or from subsets of it (including representative sequences after clustering at a given level of similarity). Several web tools also allow searches by sequence similarity. The presence of both rRNA and rDNA sequences, the handling of introns (crucial for eukaryotic sequences), a normalized, ranked taxonomy of eight terms and updates tracking new GenBank releases were made possible by a long-term collaboration between experts in taxonomy and computer scientists.
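As an illustration of what a fixed eight-field taxonomy makes possible downstream, the following minimal Python sketch parses a lineage string into named ranks and rejects records that do not carry exactly eight fields. The rank names, the `|` separator and the sample lineages are assumptions made for this example only; the abstract states that there are eight taxonomic fields but does not specify their names or the file format.

```python
from collections import Counter

# Hypothetical rank names: the text only says there are eight fields.
RANKS = (
    "kingdom", "supergroup", "division", "class",
    "order", "family", "genus", "species",
)


def parse_lineage(lineage: str, sep: str = "|") -> dict[str, str]:
    """Split a lineage string into a {rank: taxon} mapping.

    Raises ValueError if the string does not contain exactly one
    non-empty value per rank. The separator is an assumption.
    """
    fields = [field.strip() for field in lineage.split(sep)]
    if len(fields) != len(RANKS):
        raise ValueError(
            f"expected {len(RANKS)} taxonomic fields, got {len(fields)}"
        )
    if any(not field for field in fields):
        raise ValueError("empty taxonomic field in lineage")
    return dict(zip(RANKS, fields))


def count_by_rank(lineages: list[str], rank: str) -> Counter:
    """Count how many lineages fall under each taxon at the given rank."""
    if rank not in RANKS:
        raise ValueError(f"unknown rank: {rank}")
    return Counter(parse_lineage(lineage)[rank] for lineage in lineages)


if __name__ == "__main__":
    # Placeholder lineages, not records taken from the database.
    sample = [
        "K1|S1|D1|C1|O1|F1|G1|G1 sp. a",
        "K1|S1|D1|C2|O2|F2|G2|G2 sp. b",
        "K1|S2|D2|C3|O3|F3|G3|G3 sp. c",
    ]
    print(count_by_rank(sample, "supergroup"))
```

Because every record has the same number of ranks, summaries at any level reduce to a simple lookup by rank name, with no per-record handling of lineages of varying depth.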
