AIPpred: Sequence-Based Prediction of Anti-inflammatory Peptides Using Random Forest

Therapeutic peptides have received considerable attention for the treatment of inflammatory diseases and autoimmune disorders; however, identifying anti-inflammatory peptides (AIPs) through wet-lab experimentation is expensive and often time-consuming. Computational methods are therefore needed to identify potential AIP candidates prior to in vitro experimentation. In this study, we propose AIPpred (AIP predictor in primary amino acid sequences), a random forest (RF)-based method for predicting AIPs that was trained with 354 optimal features. First, we systematically studied the contribution of individual compositions [amino acid composition, dipeptide composition (DPC), amino acid index, chain-transition-distribution, and physicochemical properties] to AIP prediction. Because the DPC-based model performed significantly better than the other composition-based models, we applied a feature selection protocol to this model and identified the optimal features. AIPpred achieved an area under the curve (AUC) value of 0.801 in a 5-fold cross-validation test, ∼2% higher than that of the control RF predictor trained with all DPC features, indicating the effectiveness of the feature selection protocol. Furthermore, on an independent dataset, AIPpred achieved an AUC value of 0.814 and outperformed an existing method as well as three other machine learning methods developed in this study. These results indicate that AIPpred will be a useful tool for predicting AIPs and might efficiently assist the development of AIP therapeutics and biomedical research. AIPpred is freely accessible at www.thegleelab.org/AIPpred.
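
To make the DPC encoding concrete, the following is a minimal sketch of how a peptide can be turned into a 400-dimensional dipeptide composition vector, the full feature set from which the abstract's 354 optimal features are selected. It is an illustration under stated assumptions, not the AIPpred implementation: the function name `dipeptide_composition`, the alphabetical ordering of the 20 standard amino acids, and the choice to skip dipeptides containing non-standard residues are all hypothetical.

```python
from itertools import product

# The 20 standard amino acids; the ordering here is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 20 x 20 = 400 ordered dipeptides, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400-dimensional dipeptide composition of a peptide.

    Each entry is the count of one dipeptide divided by the number of
    valid overlapping dipeptides in the sequence. Dipeptides containing
    a non-standard residue are skipped (an assumption of this sketch).
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    # Slide a window of width 2 over the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or entirely non-standard: all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical example peptide, used only to show the call.
features = dipeptide_composition("ACAC")
```

A vector of this form, computed for each peptide, is the kind of input a random forest classifier could be trained on; feature selection would then keep a subset of the 400 entries.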
