ADPredict: ADP-ribosylation site prediction based on physicochemical and structural descriptors

Motivation: ADP-ribosylation is a post-translational modification (PTM) implicated in several crucial cellular processes, ranging from regulation of DNA repair and chromatin structure to cell metabolism and stress responses. To date, a complete understanding of ADP-ribosylation targets and their modification sites in different tissues and disease states is still lacking. Identifying ADP-ribosylation sites is required to discern the molecular mechanisms regulated by this modification. This motivated us to develop a computational tool for the prediction of ADP-ribosylated sites.

Results: Here, we present ADPredict, the first dedicated computational tool for the prediction of ADP-ribosylated aspartic and glutamic acids. This predictive algorithm is based on (i) physicochemical properties, (ii) in-house designed secondary structure-related descriptors and (iii) three-dimensional features of a set of human ADP-ribosylated proteins that have been reported in the literature. ADPredict was developed using principal component analysis and machine learning techniques; its performance was evaluated both internally, via intensive bootstrapping, and externally, on two experimental datasets. It outperformed the only other available ADP-ribosylation prediction tool, ModPred. Moreover, a novel secondary structure descriptor, HM-ratio, was introduced and contributed successfully to model development, making it a promising tool for bioinformatics studies such as PTM prediction.

Availability and implementation: ADPredict is freely available at www.ADPredict.net.

Supplementary information: Supplementary data are available at Bioinformatics online.
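
As an illustration of the internal bootstrap evaluation mentioned in the abstract, the following is a minimal sketch of out-of-bag bootstrap estimation of ROC AUC. It is not ADPredict's actual procedure: the nearest-centroid scorer is a stand-in for the real model, and all function names (`auc`, `fit_centroids`, `score`, `bootstrap_auc`) are hypothetical, introduced here only for illustration.

```python
import random


def auc(scores, labels):
    """ROC AUC by pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_centroids(X, y):
    """Stand-in model (an assumption, not the paper's): one centroid per class."""
    pos = centroid([x for x, t in zip(X, y) if t == 1])
    neg = centroid([x for x, t in zip(X, y) if t == 0])
    return pos, neg


def score(model, x):
    """Higher score means x lies closer to the positive centroid."""
    pos, neg = model
    d_pos = sum((a - b) ** 2 for a, b in zip(x, pos))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, neg))
    return d_neg - d_pos


def bootstrap_auc(X, y, n_rounds=200, seed=0):
    """Train on a resample drawn with replacement, test on the left-out rows.

    Returns (mean out-of-bag AUC, number of usable rounds). Rounds in which
    either the resample or the left-out set lacks a class are skipped.
    """
    rng = random.Random(seed)
    n = len(X)
    aucs = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        train_y = [y[i] for i in idx]
        oob_y = [y[i] for i in oob]
        if len(set(train_y)) < 2 or len(set(oob_y)) < 2:
            continue
        model = fit_centroids([X[i] for i in idx], train_y)
        aucs.append(auc([score(model, X[i]) for i in oob], oob_y))
    if not aucs:
        raise ValueError("no usable bootstrap rounds")
    return sum(aucs) / len(aucs), len(aucs)
```

Calling `bootstrap_auc(X, y)` with a list of descriptor vectors `X` and 0/1 site labels `y` returns the mean out-of-bag AUC together with the number of rounds that contributed to it. Testing only on rows left out of each resample keeps the estimate from being inflated by training examples.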
