ARGs‐OAP v2.0 with an expanded SARG database and Hidden Markov Models for enhancement characterization and quantification of antibiotic resistance genes in environmental metagenomes

Motivation: Antibiotic resistance has drawn global attention, with monitoring efforts focused on its emergence, accumulation and dissemination. For rapid characterization and quantification of antibiotic resistance genes (ARGs) in metagenomic datasets, an online analysis pipeline, ARGs-OAP, was developed around a database termed Structured Antibiotic Resistance Genes (SARG), which has a hierarchical structure (ARG type, subtype, reference sequence).

Results: The new release of the database, SARG version 2.0, contains sequences not only from the CARD and ARDB databases but also carefully selected and curated sequences from the latest protein collection of the NCBI-NR database, keeping pace with the growing number of deposited ARG sequences. SARG v2.0 contains three times as many sequences as the first version and shows improved coverage of ARG detection in metagenomes from various environmental samples. In addition to annotating high-throughput raw reads with a similarity search strategy, ARGs-OAP v2.0 now provides model-based identification of assembled sequences using SARGfam, a high-quality set of profile Hidden Markov Models (HMMs) for ARG subtypes. ARGs-OAP v2.0 also improves cell number quantification by offering the average coverage of essential single-copy marker genes as an option alongside the previous method based on the 16S rRNA gene.

Availability and implementation: ARGs-OAP can be accessed at http://smile.hku.hk/SARGs. The database can be downloaded from the same site. Source code for this study can be downloaded from https://github.com/xiaole99/ARGs-OAP-v2.0.
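The marker-gene option for cell number quantification can be illustrated with a minimal sketch. It assumes that coverage is computed as mapped bases divided by reference gene length, and that ARG abundance is reported as coverage divided by the estimated cell number (copies per cell). The function names, the marker names and the read counts are hypothetical and are not taken from the ARGs-OAP source code.

```python
from statistics import mean


def gene_coverage(n_reads: int, read_len: int, gene_len: int) -> float:
    """Coverage of one gene: mapped bases divided by reference length (nt)."""
    return n_reads * read_len / gene_len


def estimate_cell_number(marker_hits: dict, read_len: int) -> float:
    """Average coverage over single-copy marker genes.

    marker_hits maps a marker name to (mapped read count, gene length in nt).
    Each marker is assumed to occur once per genome, so the mean coverage
    approximates the number of cells sampled.
    """
    return mean(gene_coverage(n, read_len, length)
                for n, length in marker_hits.values())


def arg_copies_per_cell(arg_hits: dict, cell_number: float, read_len: int) -> dict:
    """Normalize ARG coverage by the estimated cell number."""
    return {subtype: gene_coverage(n, read_len, length) / cell_number
            for subtype, (n, length) in arg_hits.items()}


# Hypothetical counts for illustration only.
markers = {"marker_A": (100, 1000), "marker_B": (300, 1500)}
args = {"sul1": (30, 1000)}

cells = estimate_cell_number(markers, read_len=100)
abundance = arg_copies_per_cell(args, cells, read_len=100)
print(cells, abundance)
```

Averaging over several single-copy markers, rather than relying on one gene, dampens the effect of uneven coverage of any individual marker; the 16S rRNA gene, by contrast, varies in copy number between genomes.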
