iSNO-PseAAC: Predict Cysteine S-Nitrosylation Sites in Proteins by Incorporating Position Specific Amino Acid Propensity into Pseudo Amino Acid Composition

Posttranslational modifications (PTMs) of proteins are responsible for sensing and transducing signals to regulate various cellular functions and signaling events. S-nitrosylation (SNO) is one of the most important and universal PTMs. With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop computational methods for the timely identification of the exact SNO sites in proteins, because this kind of information is very useful for both basic research and drug development. Here, a new predictor, called iSNO-PseAAC, was developed for identifying the SNO sites in proteins by incorporating the position-specific amino acid propensity (PSAAP) into the general form of pseudo amino acid composition (PseAAC). The predictor was implemented using the conditional random field (CRF) algorithm. As a demonstration, a benchmark dataset was constructed that contains 731 SNO sites and 810 non-SNO sites. To reduce homology bias, none of these sites were derived from proteins whose pairwise sequence identity to any other protein reached the chosen cutoff. It was observed that the overall cross-validation success rate achieved by iSNO-PseAAC in identifying nitrosylated proteins on an independent dataset was over 90%, indicating that the new predictor is quite promising. Furthermore, a user-friendly web server for iSNO-PseAAC was established at http://app.aporc.org/iSNO-PseAAC/, through which users can easily obtain the desired results without needing to follow the mathematical equations involved in developing the prediction method. It is anticipated that iSNO-PseAAC may become a useful high-throughput tool for identifying SNO sites, or at the very least play a complementary role to the existing methods in this area.
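To make the idea of a position-specific amino acid propensity concrete, the following Python sketch builds a propensity matrix from fixed-length peptide windows centered on a cysteine and uses it to encode a peptide as a numeric vector. It is an illustration under stated assumptions, not the authors' implementation: the propensity is assumed here to be the difference between the per-position residue frequencies in the SNO and non-SNO training windows, the function names (`position_frequencies`, `psaap_matrix`, `encode`) are hypothetical, the five-residue toy peptides are invented for illustration only, and the CRF classifier itself is not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def position_frequencies(peptides):
    """Return one dict per window position mapping residue -> relative frequency."""
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same window length")
    freqs = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        freqs.append({aa: counts.get(aa, 0) / len(peptides) for aa in AMINO_ACIDS})
    return freqs


def psaap_matrix(positive, negative):
    """Assumed PSAAP: frequency in positive windows minus frequency in negative windows."""
    pos_f = position_frequencies(positive)
    neg_f = position_frequencies(negative)
    if len(pos_f) != len(neg_f):
        raise ValueError("positive and negative windows must have the same length")
    return [{aa: pf[aa] - nf[aa] for aa in AMINO_ACIDS} for pf, nf in zip(pos_f, neg_f)]


def encode(peptide, matrix):
    """Map each residue to its propensity at that position (0.0 for unknown symbols)."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the matrix window length")
    return [matrix[i].get(aa, 0.0) for i, aa in enumerate(peptide)]


# Toy windows (hypothetical data), each with the candidate cysteine at the center.
positive_windows = ["AKCDE", "AKCGE"]
negative_windows = ["GKCDE", "ALCDA"]

matrix = psaap_matrix(positive_windows, negative_windows)
features = encode("AKCGE", matrix)
print(features)
```

Because every window has a cysteine at its center, the central position always receives a propensity of zero in this sketch and carries no information; a real feature vector would typically drop it. The resulting vector is the kind of per-peptide representation that could then be passed to a sequence classifier.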
