Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection

Antimicrobial resistance (AMR) is a major threat to global public health because it renders standard treatments ineffective and contributes to the spread of infections. Understanding the biological mechanisms of AMR is important for developing new drugs and faster, more accurate clinical diagnostics. The increasing availability of whole-genome SNP (single nucleotide polymorphism) information, obtained from whole-genome sequence data, together with AMR profiles, makes it possible to use machine-learning feature selection to find AMR-associated mutations. This work describes a supervised feature selection approach that uses deep neural networks to detect AMR-associated genetic factors from whole-genome SNP data. The proposed method, DNP-AAP (deep neural pursuit – average activation potential), was tested on a Neisseria gonorrhoeae dataset with paired whole-genome sequence data and resistance profiles for five commonly used antibiotics: penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime. The results show that DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae and also provides a list of candidate SNPs, genes, and intergenic regions that might lead to the discovery of novel AMR determinants. Logistic regression classifiers built with the identified SNPs achieved prediction AUCs (area under the curve) of 0.974, 0.969, 0.949, 0.994, and 0.976 for penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime, respectively. More generally, DNP-AAP can be applied to AMR analysis of any bacterial species for which genomic variants and phenotype data are available, and it can serve as a screening tool for microbiologists to generate genetic candidates for further laboratory experiments.
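The evaluation step mentioned in the abstract (a logistic regression classifier built on selected SNPs and scored by AUC) can be illustrated with a minimal sketch. This is not DNP-AAP and not the authors' code: it assumes the SNP selection has already been done, encodes each SNP as 0/1, and uses synthetic toy data in place of the N. gonorrhoeae dataset. The names `fit_logistic`, `predict_proba`, `roc_auc`, and `make_toy_data` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def predict_proba(x, w, b):
    """Predicted probability of resistance for one isolate (sigmoid of a linear score)."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit unregularized logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi  # gradient of the log-loss w.r.t. z
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_toy_data(n=200, n_snps=6, seed=0):
    """Synthetic stand-in for selected SNPs (assumption): SNP 0 or SNP 1 drives
    resistance, the remaining SNPs are noise, and 5% of labels are flipped."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.randint(0, 1) for _ in range(n_snps)]
        label = 1 if (x[0] or x[1]) else 0
        if rng.random() < 0.05:
            label = 1 - label
        X.append(x)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    X_train, y_train = X[:140], y[:140]   # simple hold-out split
    X_test, y_test = X[140:], y[140:]
    w, b = fit_logistic(X_train, y_train)
    scores = [predict_proba(x, w, b) for x in X_test]
    print(f"held-out AUC on toy data: {roc_auc(y_test, scores):.3f}")
```

Scoring on held-out isolates rather than the training set matters here: with few samples and many SNPs, a training-set AUC would overstate how well the selected features generalize. The pairwise AUC above is quadratic in the number of isolates, which is acceptable for a sketch but would normally be replaced by a rank-based computation on larger datasets.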
