dbPTM 3.0: an informative resource for investigating substrate site specificity and functional association of protein post-translational modifications

Protein modification is an extremely important form of post-translational regulation that adjusts the physical and chemical properties, conformation, stability and activity of a protein, thereby altering its function. Because mass spectrometry (MS)-based methods identify site-specific post-translational modifications (PTMs) with high throughput, dbPTM (http://dbPTM.mbc.nctu.edu.tw/) has been updated to integrate experimental PTMs obtained from public resources as well as manually curated MS/MS peptides associated with PTMs from research articles. Version 3.0 of dbPTM aims to be an informative resource for investigating the substrate specificity of PTM sites and the functional association of PTMs between substrates and their interacting proteins. To investigate the substrate specificity of modification sites, a newly developed statistical method has been applied to identify the significant substrate motifs for each type of PTM with sufficient experimental data. According to the data statistics in dbPTM, >60% of PTM sites are located in the functional domains of proteins. Most PTMs are known to create binding sites for specific protein-interaction domains that work together for cellular function. This update therefore integrates protein–protein interaction and domain–domain interaction data to determine the functional association of PTM sites located in protein-interacting domains. Additionally, information on the structural topologies of transmembrane (TM) proteins is integrated in dbPTM to delineate the structural correlation between the reported PTM sites and TM topologies. To facilitate the investigation of PTMs on TM proteins, the PTM substrate sites and the structural topology are represented graphically. Literature information related to PTMs, orthologous conservation and substrate motifs of PTMs are also provided in the resource. Finally, this version features an improved web interface to facilitate convenient access to the resource.
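The statistic on PTM sites located in functional domains amounts to an interval-membership count. The following is a minimal sketch of that kind of calculation, not the dbPTM implementation: the function names, the 1-based inclusive coordinate convention, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a domain is a 1-based inclusive (start, end) range.
Domain = Tuple[int, int]


def site_in_domain(position: int, domains: List[Domain]) -> bool:
    """Return True if a residue position falls inside any annotated domain."""
    return any(start <= position <= end for start, end in domains)


def fraction_in_domains(
    ptm_sites: Dict[str, List[int]],
    domains_by_protein: Dict[str, List[Domain]],
) -> float:
    """Fraction of all PTM sites that lie within an annotated domain."""
    total = 0
    inside = 0
    for protein, positions in ptm_sites.items():
        # Proteins without domain annotation contribute sites but no hits.
        domains = domains_by_protein.get(protein, [])
        for pos in positions:
            total += 1
            if site_in_domain(pos, domains):
                inside += 1
    return inside / total if total else 0.0


# Made-up toy data, not taken from dbPTM.
ptm_sites = {"P1": [10, 55, 120], "P2": [7, 300]}
domains_by_protein = {"P1": [(5, 60)], "P2": [(250, 350)]}
fraction = fraction_in_domains(ptm_sites, domains_by_protein)
print(f"{fraction:.0%} of PTM sites fall inside an annotated domain")
```

In this toy input, three of the five sites fall inside a domain. A real analysis would take site positions and domain boundaries from the integrated annotation sources rather than from hand-written dictionaries.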
