The MAR databases: development and implementation of databases specific for marine metagenomics

Abstract We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. MarRef is a reference database of completely sequenced marine prokaryotic genomes, whereas MarDB includes all incompletely sequenced prokaryotic genomes, regardless of level of completeness. The third database, MarCat, is a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record consists of 106 metadata fields, including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/.
