PlutoF—a Web Based Workbench for Ecological and Taxonomic Research, with an Online Implementation for Fungal ITS Sequences

DNA sequences accumulating in the International Nucleotide Sequence Databases (INSD) form a rich source of information for taxonomic and ecological meta-analyses. However, these databases include many erroneous entries, and the data are poorly annotated with metadata, making it difficult to target and extract entries of interest with any degree of precision. Here we describe the web-based workbench PlutoF, which is designed to bridge the gap between the needs of contemporary research in biology and the existing software resources and databases. Built on a relational database, PlutoF allows rapid remote submission, retrieval, and analysis of study, specimen, and sequence data, both for INSD entries and for private datasets, through web-based thin clients. In contrast to INSD, PlutoF supports internationally standardized terminology to allow very specific annotation and linking of interacting specimens and species. The sequence analysis module is optimized for identification and analysis of environmental ITS sequences of fungi, but it can be modified to operate on any genetic marker and group of organisms. The workbench is available at http://plutof.ut.ee.
