PARGT: a software tool for predicting antimicrobial resistance in bacteria

With the ever-increasing availability of whole-genome sequences, machine-learning approaches can be used as an alternative to traditional alignment-based methods for identifying new antimicrobial resistance (AMR) genes. Such approaches are especially helpful when pathogens cannot be cultured in the lab. In previous work, we proposed a game-theory-based feature evaluation algorithm. Using the protein characteristics identified by this algorithm (called ‘features’ in machine learning), our model accurately identified AMR genes in Gram-negative bacteria. Here we extend our study to Gram-positive bacteria, showing that coupling game-theory-identified features with machine learning achieved classification accuracies between 87% and 90% for genes encoding resistance to the antibiotics bacitracin and vancomycin. Importantly, we present a standalone software tool that implements the game-theory algorithm and machine-learning model used in these studies.
