Computational Identification of Protein Pupylation Sites by Using Profile-Based Composition of k-Spaced Amino Acid Pairs

Prokaryotic proteins are regulated by pupylation, a post-translational modification that contributes to cellular function in bacteria. In the pupylation process, the prokaryotic ubiquitin-like protein (Pup) tags target proteins for proteasomal degradation, in a manner functionally analogous to ubiquitination. To date, several experimental methods have been developed to identify pupylated proteins and their pupylation sites, but these methods are generally laborious and costly. Computational methods that can accurately predict potential pupylation sites from protein sequence information are therefore highly desirable. In this paper, a novel predictor termed pbPUP has been developed for accurate prediction of pupylation sites. In particular, a sophisticated sequence encoding scheme, the profile-based composition of k-spaced amino acid pairs (pbCKSAAP), is used to represent the sequence patterns and evolutionary information of the sequence fragments surrounding pupylation sites. A Support Vector Machine (SVM) classifier is then trained using the pbCKSAAP encoding scheme. The final pbPUP predictor achieves an AUC value of 0.849 in 10-fold cross-validation tests and outperforms other existing predictors on a comprehensive independent test dataset. The proposed method is anticipated to be a helpful computational resource for the prediction of pupylation sites. The web server and curated datasets from this study are freely available at http://protein.cau.edu.cn/pbPUP/.
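
To make the encoding idea concrete, the following is a minimal sketch of the plain (sequence-only) composition of k-spaced amino acid pairs for a fragment centered on a candidate lysine. It is not the pbCKSAAP scheme itself, which additionally draws on evolutionary information; the function name `cksaap`, the default `k_max=4`, and the handling of non-standard residues are assumptions made for illustration only.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so feature indices are stable.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(fragment, k_max=4):
    """Return the CKSAAP feature vector of a sequence fragment.

    For each gap k in 0..k_max, count every residue pair separated by
    exactly k positions and divide by the number of such pairs in the
    fragment. The result has 400 * (k_max + 1) entries.
    Assumption: pairs containing a non-standard symbol (e.g. 'X') are
    not counted but still contribute to the denominator.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = len(fragment) - k - 1  # number of k-spaced pairs
        for i in range(max(n_pairs, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        total = max(n_pairs, 1)  # avoid division by zero on short input
        features.extend(counts[p] / total for p in PAIRS)
    return features


if __name__ == "__main__":
    # Hypothetical fragment; real inputs would be windows around lysines.
    vec = cksaap("MAKLEKTGA", k_max=2)
    print(len(vec))  # 400 features per gap value
```

A vector of this kind could then serve as the input to an SVM classifier; a profile-based variant would replace the raw residue counts with values derived from a sequence profile.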
