Deep learning improves antimicrobial peptide recognition

Abstract

Motivation: Bacterial resistance to antibiotics is a growing concern. Antimicrobial peptides (AMPs), natural components of innate immunity, are popular targets for developing new drugs. Machine learning methods are now commonly adopted by wet-laboratory researchers to screen for promising candidates.

Results: In this work, we utilize deep learning to recognize antimicrobial activity. We propose a neural network model with convolutional and recurrent layers that leverage primary sequence composition. Results show that the proposed model outperforms state-of-the-art classification models on a comprehensive dataset. By utilizing the embedding weights, we also present a reduced-alphabet representation and show that reasonable AMP recognition can be maintained using nine amino acid types.

Availability and implementation: Models and datasets are made freely available through the Antimicrobial Peptide Scanner vr.2 web server at www.ampscanner.com.

Supplementary information: Supplementary data are available at Bioinformatics online.
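The reduced-alphabet idea in the Results can be illustrated with a small sketch. It assumes that residues are grouped by clustering their embedding vectors with k-means; the abstract does not state the grouping procedure, so this is an assumption rather than the authors' method. The embedding values below are random placeholders for trained weights. The embedding dimension of 8, the function names, and the digit symbols used for the groups are all hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmeans(vectors, k, iterations=50, seed=0):
    """Plain Lloyd-style k-means; returns one cluster index per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_reduced_alphabet(embeddings, k=9):
    """Map each residue to a group symbol ('0'..'k-1') by embedding cluster."""
    residues = sorted(embeddings)
    labels = kmeans([embeddings[r] for r in residues], k)
    return {r: str(lab) for r, lab in zip(residues, labels)}


def reduce_sequence(sequence, mapping):
    """Re-encode a peptide using the reduced alphabet."""
    return "".join(mapping[res] for res in sequence)


# Hypothetical stand-in for trained embedding weights (NOT values from the paper).
rng = random.Random(42)
embeddings = {aa: [rng.gauss(0, 1) for _ in range(8)] for aa in AMINO_ACIDS}

mapping = build_reduced_alphabet(embeddings, k=9)
reduced = reduce_sequence("GIGKFLHSAKKFGKAFVGEIMNS", mapping)
print(mapping)
print(reduced)
```

With real embedding weights in place of the random vectors, residues that the model treats similarly would share a group symbol, and a classifier could then be retrained on the re-encoded sequences to check how much recognition performance is retained.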
