DAPPLE 2: a Tool for the Homology-Based Prediction of Post-Translational Modification Sites.

The post-translational modification of proteins is critical for regulating their function. Although many post-translational modification sites have been experimentally determined, particularly in certain model organisms, experimental knowledge of these sites is severely lacking for many species. Thus, it is important to be able to predict sites of post-translational modification in such species. Previously, we described DAPPLE, a tool that facilitates the homology-based prediction of one particular post-translational modification, phosphorylation, in an organism of interest using known phosphorylation sites from other organisms. Here, we describe DAPPLE 2, which expands and improves upon DAPPLE in three major ways. First, it predicts sites for 20 different types of post-translational modification using data from 15 online databases. Second, it makes predictions approximately 2-7 times faster than DAPPLE, depending on the database size and the organism of interest. Third, it simplifies and accelerates the selection of predicted sites of interest by categorizing them based on gene ontology terms, keywords, and signaling pathways. We show that DAPPLE 2 can successfully predict known human post-translational modification sites using, as input, known sites from species that are either closely (e.g., mouse) or distantly (e.g., yeast) related to humans. DAPPLE 2 can be accessed at http://saphire.usask.ca/saphire/dapple2.
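The general idea behind homology-based site prediction can be illustrated with a minimal Python sketch. It is not DAPPLE 2's implementation: it assumes a pairwise alignment between a protein with a known modification site and a protein from the organism of interest has already been computed (for example, by a sequence similarity search), and it only shows how a site position might be carried across that alignment. The function name `transfer_site`, the gap character `-`, and the rule that the aligned residue must be identical are all assumptions made for illustration.

```python
def transfer_site(aligned_known, aligned_target, site_pos):
    """Carry a known modification site across a pairwise alignment.

    aligned_known / aligned_target: equal-length aligned sequences,
    with '-' marking gaps (hypothetical input format).
    site_pos: 1-based position of the known site in the ungapped
    known protein.
    Returns the 1-based position in the ungapped target protein, or
    None if the site aligns to a gap or to a different residue.
    """
    if len(aligned_known) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")

    known_idx = 0   # residues seen so far in the known protein
    target_idx = 0  # residues seen so far in the target protein
    for k, t in zip(aligned_known, aligned_target):
        if k != "-":
            known_idx += 1
        if t != "-":
            target_idx += 1
        if k != "-" and known_idx == site_pos:
            # Assumed rule: predict a site only if the residue is conserved.
            return target_idx if t == k else None
    return None  # site_pos lies beyond the aligned region


# Toy alignment; the sequences are made up for illustration.
known = "MKS-PTAY"
target = "MRSAPT-Y"
for pos in (3, 5, 6):
    print(pos, transfer_site(known, target, pos))
```

A real pipeline would also have to decide which alignments are trustworthy enough to use and how to weigh conservation of the residues surrounding the site; this sketch leaves those choices out.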
