MicroScope—an integrated resource for community expertise of gene functions and comparative analysis of microbial genomic and metabolic data

Abstract The overwhelming number of new bacterial genomes becoming available daily makes accurate genome annotation an essential step, one that ultimately determines the relevance of the thousands of genomes stored in public databanks. The MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Starting from the results of our syntactic, functional and relational annotation pipelines, MicroScope provides an integrated environment for the expert annotation and comparative analysis of prokaryotic genomes. It combines tools and graphical interfaces to analyze genomes and to perform manual curation of gene function in a comparative genomics and metabolic context. In this article, we describe the free-of-charge MicroScope services for the annotation and analysis of microbial (meta)genomes, transcriptomic and re-sequencing data. We then present the functionalities of the platform in a way that provides practical guidance to nonspecialists in bioinformatics. Newly integrated analysis tools (i.e. prediction of virulence and resistance genes in bacterial genomes) and an original, recently developed method (the pan-genome graph representation) are also described. Integrated environments such as MicroScope clearly help, through their user communities, to maintain accurate resources.
