A laboratory information management system for DNA barcoding workflows.

This paper presents a laboratory information management system (LIMS) for DNA sequences, designed around the needs of a DNA barcoding project at the CBS-KNAW Fungal Biodiversity Centre (Utrecht, the Netherlands). DNA barcoding is a global initiative for species identification through simple DNA sequence markers. We aim to generate barcode data for all strains (or specimens) in the collection (currently ca. 80,000). The LIMS was developed to manage large amounts of sequence data and to keep track of the whole experimental procedure. Because the quality of the sequence data has improved, the system has allowed us to classify strains more efficiently; as a result, strains have been given up-to-date taxonomic names and more accurate correlation analyses have been carried out.
