Better understanding and prediction of antiviral peptides through primary and secondary structure feature importance

The emergence of viral epidemics throughout the world is of concern due to the scarcity of available effective antiviral therapeutics. The discovery of new antiviral therapies is imperative to address this challenge, and antiviral peptides (AVPs) represent a valuable resource for the development of novel therapies to combat viral infection. We present a new machine learning model to distinguish AVPs from non-AVPs using the most informative features derived from the physicochemical and structural properties of their amino acid sequences. To focus on those features that are most likely to contribute to antiviral performance, we filter potential features based on their importance for classification. These feature selection analyses suggest that secondary structure is the most important peptide sequence feature for predicting AVPs. Our Feature-Informed Reduced Machine Learning for Antiviral Peptide Prediction (FIRM-AVP) approach achieves a higher accuracy than either the model with all features or current state-of-the-art single classifiers. Understanding the features that are associated with AVP activity is a core need to identify and design new AVPs in novel systems. The FIRM-AVP code and standalone software package are available at https://github.com/pmartR/FIRM-AVP with an accompanying web application at https://msc-viz.emsl.pnnl.gov/AVPR.
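The feature-filtering idea described above can be illustrated with a minimal sketch. This is not the FIRM-AVP implementation: the abstract does not state which importance measure is used, so the example swaps in a crude univariate score (the absolute gap between class means) and uses amino acid composition as a single illustrative sequence-derived feature set. The function names and the toy peptides are hypothetical, and the labels are made up purely so the code runs.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fraction of each standard amino acid in a peptide."""
    counts = Counter(seq.upper())
    n = max(len(seq), 1)  # guard against empty sequences
    return [counts[a] / n for a in AMINO_ACIDS]


def mean_gap_scores(X, y):
    """Score each feature by |mean over positives - mean over negatives|.

    Assumption: a simple stand-in for a model-based importance measure.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        scores.append(abs(mean_pos - mean_neg))
    return scores


def top_k_features(scores, k):
    """Return the indices of the k highest-scoring features, in column order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data: hypothetical sequences and labels, not real AVP measurements.
peptides = ["KKLLKKLLKK", "KLKKLLKKLK", "GSGSGSGSGS", "SGGSSGGSGS"]
labels = [1, 1, 0, 0]

X = [composition(p) for p in peptides]
scores = mean_gap_scores(X, labels)
kept = top_k_features(scores, k=4)
kept_names = [AMINO_ACIDS[j] for j in kept]

# Reduced feature matrix that a downstream classifier would be trained on.
X_reduced = [[row[j] for j in kept] for row in X]
print(kept_names)
```

In a realistic setting the scoring step would be replaced by whatever importance measure the chosen classifier provides, and the reduced matrix would then be used to retrain and evaluate that classifier against the full-feature model.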
