A novel method to improve recognition of antimicrobial peptides through distal sequence-based features

Growing bacterial resistance to antibiotics is driving the need for new lines of treatment. The discovery of naturally occurring antimicrobial peptides (AMPs) has motivated many experimental and computational researchers to pursue AMPs as possible templates. In the experimental community, the focus is generally on systematic point-mutation studies that measure the effect of each mutation on antibacterial activity. In the computational community, the goal is to understand what determines such activity in a machine learning setting. In this setting, it is essential to identify biological signals, or features, in AMPs that are predictive of antibacterial activity, and constructing effective features has proven challenging. In this paper, we advance research in this direction. We propose a novel method to construct and select complex sequence-based features that capture information about distal patterns within a peptide. A thorough comparative analysis indicates that such features are competitive with the state of the art in AMP recognition while providing transparent summarizations of antibacterial activity at the sequence level. We demonstrate that these features can be combined with additional physicochemical features of interest to a biological researcher to facilitate specific AMP design or modification in the wet laboratory. Code, data, results, and analysis accompanying this paper are publicly available online at: http://cs.gmu.edu/~ashehu/?q=OurTools.
