PalmPred: An SVM Based Palmitoylation Prediction Method Using Sequence Profile Information

Protein palmitoylation is the covalent attachment of the 16-carbon fatty acid palmitate to a cysteine residue. It is the most common acylation of proteins and occurs only in eukaryotes. Palmitoylation plays an important role in regulating protein subcellular localization, stability, translocation to lipid rafts, and many other protein functions. Accurate prediction of palmitoylation sites can therefore help in understanding the molecular mechanism of palmitoylation and in designing related experiments. Here we present a novel in silico predictor, ‘PalmPred’, which identifies palmitoylation sites from protein sequence information using a support vector machine model. The best performance of PalmPred, assessed with a leave-one-out approach, was obtained by incorporating sequence conservation features of peptides with a window size of 11. This configuration achieved an accuracy of 91.98%, a sensitivity of 79.23%, a specificity of 94.30%, and a Matthews correlation coefficient of 0.71. On an independent dataset, PalmPred outperformed the existing palmitoylation site prediction methods IFS-Palm and WAP-Palm. Based on these measures, we anticipate that PalmPred will be helpful in identifying candidate palmitoylation sites. The source datasets, a standalone version, and a web server are available at http://14.139.227.92/mkumar/palmpred/.
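To make the window size and the reported measures concrete, the following is a minimal Python sketch, not the PalmPred implementation. It cuts an 11-residue peptide centered on each cysteine and computes accuracy, sensitivity, specificity, and the Matthews correlation coefficient from confusion-matrix counts. The function names, the example sequence, and the use of "X" to pad windows that run past the sequence termini are illustrative assumptions; the abstract does not say how termini are handled, and the sequence conservation features themselves are not computed here.

```python
import math


def cysteine_windows(sequence, window=11, pad="X"):
    """Return (1-based position, peptide) for each cysteine in the sequence.

    Each peptide has `window` residues with the cysteine in the center.
    Padding with `pad` at the termini is an assumption made for this sketch.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so the cysteine sits in the center")
    half = window // 2
    padded = pad * half + sequence + pad * half
    peptides = []
    for i, residue in enumerate(sequence):
        if residue == "C":
            # Index i in the sequence is index i + half in the padded string,
            # so the slice starting at i is centered on the cysteine.
            peptides.append((i + 1, padded[i:i + window]))
    return peptides


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary classification measures from raw counts."""
    total = tp + tn + fp + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # MCC is conventionally set to 0 when any marginal count is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical 12-residue sequence with cysteines at positions 3 and 12.
    for position, peptide in cysteine_windows("MACDEFGHIKLC"):
        print(position, peptide)
    # Hypothetical counts, not results from the paper.
    print(binary_metrics(tp=8, tn=85, fp=5, fn=2))
```

In a full pipeline, each peptide would be converted into a feature vector (for PalmPred, sequence conservation features) before being passed to the support vector machine; that step is outside the scope of this sketch.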
