iMethyl-PseAAC: Identification of Protein Methylation Sites via a Pseudo Amino Acid Composition Approach

Before becoming native proteins, the polypeptide chains produced when ribosomes translate mRNA undergo a series of "product-forming" steps, such as cutting, folding, and posttranslational modification (PTM). Knowledge of PTMs in proteins is crucial for dynamic proteome analysis of various human diseases and of epigenetic inheritance. One of the most important PTMs is methylation, which occurs on arginine (Arg) or lysine (Lys) residues. Given a protein, which of its Arg (or Lys) sites can be methylated, and which cannot? Answering this question is a first step toward understanding the methylation mechanism in depth and toward drug development. With the avalanche of protein sequences generated in the postgenomic age, the need for an answer has become urgent. To address this problem, we propose a new predictor called iMethyl-PseAAC. In this prediction system, each peptide sample is represented as a 346-dimensional vector, formed by incorporating its physicochemical, sequence evolution, biochemical, and structural disorder information into the general form of pseudo amino acid composition. Rigorous jackknife and independent dataset tests showed that iMethyl-PseAAC outperformed the existing predictors in this area.
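The site-level question posed above (which Arg or Lys positions in a protein are candidates) can be illustrated with a minimal sketch that lists every occurrence of a target residue together with a fixed-length peptide window centered on it. This is not the authors' implementation: the function name `candidate_windows`, the half-width `flank`, and the padding character `X` are assumptions made for illustration, and the abstract does not state the window length actually used.

```python
def candidate_windows(sequence, residue, flank=5, pad="X"):
    """Return (1-based position, peptide window) for each occurrence of `residue`.

    `flank` (residues kept on each side) and `pad` (filler used near the
    termini) are illustrative assumptions, not values taken from the paper.
    """
    # Pad both ends so that sites near the termini still give full-length windows.
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, aa in enumerate(sequence):
        if aa == residue:
            # Index i in `sequence` sits at i + flank in `padded`,
            # so the slice below is centered on the target residue.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    for pos, pep in candidate_windows("MKRAGKLR", "K", flank=2):
        print(pos, pep)
```

Each window produced this way would then be encoded as a feature vector (346-dimensional in the paper) and passed to a classifier; that encoding is not reproduced here.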
