Identification of Fungal DNA Barcode Targets and PCR Primers Based on Pfam Protein Families and Taxonomic Hierarchy

DNA barcoding is the use of DNA sequences from standardized genetic markers to identify eukaryotic organisms. We sought alternative candidate barcode gene targets for fungi from the available fungal genomes using a taxonomy-aware processing pipeline. Putative protein-coding sequences were matched to Pfam protein families and aligned to reference Pfam accessions. Conserved sequence blocks were identified in the resulting alignments, and degenerate primers were designed. The processing pipeline is described and the resulting candidate gene targets are discussed. The pipeline can analyze subsets of the available reference data at various levels of the taxonomic hierarchy (selectable by GenBank taxonomy ID or scientific name), so that discrete taxonomic groups can be combined into a single subset, or subordinate taxa can be excluded from the analysis of higher-level taxa. Putative degenerate primer pairs were designed at ranks as high as superkingdom for the set of organisms included in the analysis. Like the well-known phylogenetic and barcode markers, the identified targets have essential housekeeping functions, and most have greater potential to resolve species among fully sequenced genomes than the markers most widely used at present. Some of the commonly used species-level phylogenetic markers for fungi, especially tef1-α and rpb2, were not recovered in our analysis because they exist in multiple copies in single organisms and because Pfam families do not always correspond to complete proteins.
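As a rough illustration of the step from conserved alignment blocks to degenerate primers, the following minimal Python sketch collapses each column of a nucleotide alignment into an IUPAC ambiguity code and reports gap-free windows whose total degeneracy stays below a threshold. It is not the pipeline described here: the function names, the window length, and the degeneracy cutoff are hypothetical choices made for illustration, and the sketch works directly on aligned nucleotides and does not use Pfam-aligned protein blocks.

```python
# IUPAC nucleotide ambiguity codes, keyed by the set of bases seen in a column.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}
# Number of concrete bases each code stands for (e.g. Y -> 2, N -> 4).
DEGENERACY = {code: len(bases) for bases, code in IUPAC.items()}


def degenerate_consensus(aligned_seqs):
    """Return one IUPAC code per alignment column.

    Columns containing a gap or any non-ACGT symbol are marked '-'.
    """
    consensus = []
    for column in zip(*aligned_seqs):
        bases = frozenset(base.upper() for base in column)
        consensus.append(IUPAC.get(bases, "-"))
    return "".join(consensus)


def candidate_primer_sites(aligned_seqs, length=20, max_degeneracy=64):
    """List (start, degenerate_primer, degeneracy) for acceptable windows.

    A window is accepted if it has no '-' column and the product of the
    per-position degeneracies does not exceed max_degeneracy.
    Both parameters are illustrative defaults, not values from the paper.
    """
    consensus = degenerate_consensus(aligned_seqs)
    sites = []
    for start in range(len(consensus) - length + 1):
        window = consensus[start:start + length]
        if "-" in window:
            continue
        degeneracy = 1
        for code in window:
            degeneracy *= DEGENERACY[code]
        if degeneracy <= max_degeneracy:
            sites.append((start, window, degeneracy))
    return sites


if __name__ == "__main__":
    # Toy alignment, invented for this example only.
    alignment = ["ATGGCTAAAGG", "ATGGCCAAAGG", "ATGGCTAAGGG"]
    print(degenerate_consensus(alignment))
    print(candidate_primer_sites(alignment, length=8, max_degeneracy=2))
```

A real primer design step would also weigh melting temperature, 3'-end stability, and amplicon length, and would pair forward and reverse sites; those considerations are deliberately left out of this sketch.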
