A note on the incidence of reverse complementary fungal ITS sequences in the public sequence databases and a software tool for their detection and reorientation

Reverse complementary DNA sequences (sequences that are inadvertently cast backward and in which every base is replaced by its complement, so that purines and pyrimidines are transposed) are not uncommon in sequence databases, where they may introduce noise into sequence-based research. We show that about 1% of the public fungal sequences of the ITS region, the most commonly sequenced genetic marker in mycology, are reverse complementary, and we introduce an open-source software solution to automate their detection and reorientation. The Mac OS X/Linux/UNIX software operates on public or private datasets of any size, although some 50 base pairs of the 5.8S gene of the ITS region are needed for the analysis.
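To make the notion of reverse complementation and reorientation concrete, the following is a minimal Python sketch. It is not the software introduced here, and it does not reproduce its detection method. As an assumption for illustration, it decides orientation by counting k-mers shared with a reference fragment of known orientation, once for the query as given and once for its reverse complement. The function names (`reverse_complement`, `orient`), the parameter `k`, and the toy reference sequence are all hypothetical.

```python
# Illustrative sketch only: a naive k-mer comparison, not the published tool's method.

# Complement table covering the IUPAC nucleotide codes in both cases.
_COMPLEMENT = str.maketrans(
    "ACGTRYKMSWBDHVNacgtrykmswbdhvn",
    "TGCAYRMKSWVHDBNtgcayrmkswvhdbn",
)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def _kmers(seq, k):
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def orient(query, reference, k=8):
    """Return (sequence, was_reversed).

    The query is reverse complemented when that orientation shares more
    k-mers with the reference than the query as given does.
    """
    ref_kmers = _kmers(reference, k)
    forward_hits = len(_kmers(query, k) & ref_kmers)
    reverse_hits = len(_kmers(reverse_complement(query), k) & ref_kmers)
    if reverse_hits > forward_hits:
        return reverse_complement(query), True
    return query, False


if __name__ == "__main__":
    # Made-up toy sequence standing in for a correctly oriented reference.
    reference = "ATGCGTACGTTAGCCGATAGGCTTAACGGATCCA"
    flipped = reverse_complement(reference)
    fixed, was_reversed = orient(flipped, reference)
    print(was_reversed, fixed == reference)
```

Exact k-mer matching of this kind only works when the query closely resembles the reference, so a real dataset would call for a more tolerant comparison against a conserved region such as the 5.8S gene.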
