Global Catalogue of Microorganisms (GCM): a comprehensive database and information retrieval, analysis, and visualization system for microbial resources

Background: Throughout the long history of industrial and academic research, many microbes have been isolated, characterized and, whenever possible, preserved in culture collections. With the steady accumulation of biodiversity observation data and microbial sequencing data, bio-resource centers have to function as data and information repositories that serve academia, industry, and regulators on behalf of, and for, the general public. Hence, the World Data Centre for Microorganisms (WDCM) has taken on the responsibility of constructing an effective information environment that promotes and sustains microbial research data activities and bridges the gaps currently present within and outside the microbiology communities.

Description: Strain catalogue information was collected from culture collections by online submission. We developed tools for the automatic extraction of strain numbers and species names from various sources, including GenBank, PubMed, and Swiss-Prot. These tools connect strain catalogue information with the corresponding nucleotide and protein sequences, as well as with genome sequences and references citing a particular strain. All information was processed and compiled into a comprehensive database of microbial resources, named the Global Catalogue of Microorganisms (GCM).
The current version of GCM contains information on over 273,933 strains, covering 43,436 bacterial, fungal and archaeal species from 52 collections in 25 countries and regions. A number of online analysis and statistical tools have been integrated, together with advanced search functions, which should greatly facilitate exploration of the content of GCM.

Conclusion: A comprehensive, dynamic database of microbial resources has been created. It unveils the resources preserved in culture collections, especially those whose informatics infrastructures are still under development. This should foster cumulative research and facilitate the work of microbiologists worldwide, in both public and industrial research centres. The database is available from http://gcm.wfcc.info.

[1]  Rodrigo Lopez,et al.  Clustal W and Clustal X version 2.0 , 2007, Bioinform..

[2]  Eugene W. Myers,et al.  Basic local alignment search tool. Journal of Molecular Biology , 1990 .

[3]  С М Озерская,et al.  OECD BEST PRACTICE GUIDELINES FOR BIOLOGICAL RESOURCE CENTRES. OECD, 2007, 115 P , 2008 .

[4]  David L. Wheeler,et al.  GenBank , 2015, Nucleic Acids Res..

[5]  Scott Federhen,et al.  The NCBI Taxonomy database , 2011, Nucleic Acids Res..

[6]  Hideaki Sugawara,et al.  Networking of Biological Resource Centers: WDCM experiences , 2002, Data Sci. J..

[7]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[8]  Renzo Kottmann,et al.  Meeting Report: Hackathon-Workshop on Darwin Core and MIxS Standards Alignment (February 2012) , 2012, Standards in genomic sciences.

[9]  The UniProt Consortium,et al.  Reorganizing the protein space at the Universal Protein Resource (UniProt) , 2011, Nucleic Acids Res..

[10]  David S. Goodsell,et al.  The RCSB Protein Data Bank: redesigned web site and web services , 2010, Nucleic Acids Res..

[11]  Towards an index of all known species: the Catalogue of Life, its rationale, design and use. , 2006, Integrative zoology.

[12]  J Smith,et al.  Structuring strain data for storage and retrieval of information on fungi and yeasts in MINE, the Microbial Information Network Europe. , 1988, Journal of general microbiology.

[13]  J. Silberg,et al.  A transposase strategy for creating libraries of circularly permuted proteins , 2012, Nucleic acids research.

[14]  Pedro W. Crous,et al.  MycoBank: an online initiative to launch mycology into the 21st century , 2004 .

[15]  J. Euzéby List of Bacterial Names with Standing in Nomenclature: a folder available on the Internet. , 1997, International journal of systematic bacteriology.