iPTMnet: an integrated resource for protein post-translational modification network discovery

Abstract Protein post-translational modifications (PTMs) play a pivotal role in numerous biological processes by modulating protein function. We have developed iPTMnet (http://proteininformationresource.org/iPTMnet) for PTM knowledge discovery. It employs an integrative bioinformatics approach that combines text mining, data mining and ontological representation to capture rich PTM information, including PTM enzyme-substrate-site relationships, PTM-specific protein-protein interactions (PPIs) and PTM conservation across species. iPTMnet encompasses data from (i) our PTM-focused text mining tools, RLIMS-P and eFIP, which extract phosphorylation information from full-scale mining of PubMed abstracts and full-length articles; (ii) a set of curated databases with experimentally observed PTMs; and (iii) the Protein Ontology, which organizes proteins and PTM proteoforms, enabling their representation, annotation and comparison within and across species. The iPTMnet knowledgebase currently covers eight major PTM types (phosphorylation, ubiquitination, acetylation, methylation, glycosylation, S-nitrosylation, sumoylation and myristoylation) and contains more than 654 500 unique PTM sites in over 62 100 proteins, along with more than 1200 PTM enzymes and over 24 300 PTM enzyme-substrate-site relations. The website supports online search, browsing, retrieval and visual analysis for scientific queries. Several examples, including functional interpretation of phosphoproteomic data, demonstrate iPTMnet as a gateway for visual exploration and systematic analysis of PTM networks and conservation, thereby enabling PTM discovery and hypothesis generation.
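To make the notion of a PTM enzyme-substrate-site relation concrete, the following is a minimal sketch of how such relations could be held in memory and indexed as a small network. It is not the iPTMnet schema or API: the `PTMRelation` type, the helper functions and every record (`KinaseA`, `SubstrateX`, `S10` and so on) are hypothetical placeholders chosen for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class PTMRelation(NamedTuple):
    """One enzyme-substrate-site relation (hypothetical structure)."""
    enzyme: str
    substrate: str
    site: str
    ptm_type: str


# Placeholder records for illustration; these are NOT taken from iPTMnet.
RELATIONS = [
    PTMRelation("KinaseA", "SubstrateX", "S10", "phosphorylation"),
    PTMRelation("KinaseA", "SubstrateY", "T25", "phosphorylation"),
    PTMRelation("KinaseB", "SubstrateX", "S10", "phosphorylation"),
    PTMRelation("LigaseC", "SubstrateX", "K48", "ubiquitination"),
]


def build_network(relations):
    """Index relations by enzyme and by (substrate, site)."""
    by_enzyme = defaultdict(set)
    by_site = defaultdict(set)
    for rel in relations:
        by_enzyme[rel.enzyme].add((rel.substrate, rel.site))
        by_site[(rel.substrate, rel.site)].add(rel.enzyme)
    return by_enzyme, by_site


def shared_sites(by_site):
    """Return the sites that are targeted by more than one enzyme."""
    return {site: enzymes for site, enzymes in by_site.items() if len(enzymes) > 1}


by_enzyme, by_site = build_network(RELATIONS)
print(sorted(by_enzyme["KinaseA"]))
print(shared_sites(by_site))
```

Indexing in both directions keeps the two typical queries cheap: which sites a given enzyme modifies, and which enzymes converge on a given site.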
