A Comprehensive, Automatically Updated Fungal ITS Sequence Dataset for Reference-Based Chimera Control in Environmental Sequencing Efforts

The nuclear ribosomal internal transcribed spacer (ITS) region is the most commonly chosen genetic marker for the molecular identification of fungi in environmental sequencing and molecular ecology studies. Several analytical issues complicate such efforts, one of which is the formation of chimeric (artificially joined) DNA sequences during PCR amplification or sequence assembly. Several software tools are currently available for chimera detection, but they rely to various degrees on a chimera-free reference dataset for optimal performance. However, no such dataset is available for the fungal ITS region. This study introduces a comprehensive, automatically updated reference dataset of fungal ITS sequences, based on the UNITE database for the molecular identification of fungi. The dataset supports chimera detection throughout the fungal kingdom, both for full-length ITS sequences and for partial (ITS1 or ITS2 only) datasets. The performance of the dataset on a large set of artificial chimeras was above 99.5%, and we subsequently used the dataset to remove nearly 1,000 compromised fungal ITS sequences from public circulation. The dataset is available at http://unite.ut.ee/repository.php and is subject to web-based third-party curation.
