Computational prediction of species‐specific malonylation sites via enhanced characteristic strategy

Motivation: Protein malonylation is a novel post-translational modification (PTM) that orchestrates a variety of biological processes. Annotating malonylation sites across the proteome is a crucial first step toward deciphering the modification's physiological roles, which are implicated in pathological processes. Compared with expensive and laborious experimental approaches, computational prediction can provide an accurate and efficient way to identify many types of PTM sites. However, there is still no online predictor for lysine malonylation.

Results: By searching the literature and databases, well-prepared, up-to-date benchmark datasets were collected for multiple organisms. Data analyses demonstrated that malonylated proteins in different organisms were preferentially involved in different biological processes and pathways. Unique sequence preferences were also observed for each organism. We therefore developed MaloPred, a novel online tool that predicts malonylation sites for three species by integrating various informative features through an enhanced feature strategy. On the independent test datasets, area under the receiver operating characteristic curve (AUC) scores of 0.755, 0.827 and 0.871 were obtained for Escherichia coli (E. coli), Mus musculus (M. musculus) and Homo sapiens (H. sapiens), respectively. These results suggest that MaloPred can provide instructive guidance for further experimental investigation of protein malonylation.

Availability and Implementation: http://bioinfo.ncu.edu.cn/MaloPred.aspx.

Contact: jdqiu@ncu.edu.cn

Supplementary information: Supplementary data are available at Bioinformatics online.
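To make the reported metric concrete, the following is a minimal sketch of how an AUC score can be computed from predicted site scores and true labels. It is not MaloPred's own evaluation code: the function name `roc_auc` and the toy labels and scores are assumptions for illustration. It relies on the standard equivalence between the AUC and the probability that a randomly chosen positive (malonylated) site receives a higher score than a randomly chosen negative site.

```python
def roc_auc(labels, scores):
    """Illustrative AUC (hypothetical helper, not from MaloPred).

    Computes the fraction of (positive, negative) pairs in which the
    positive site is scored higher; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative site")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = malonylated lysine, 0 = non-malonylated lysine.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of sites, so a rank-based implementation would be preferable for large test sets.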
