ProteomeScout: a repository and analysis resource for post-translational modifications and proteins

ProteomeScout (https://proteomescout.wustl.edu) is a resource for the study of proteins and their post-translational modifications (PTMs). It consists of a database of PTMs, a repository for experimental data, an analysis suite for PTM experiments, and a tool for visualizing the relationships between complex protein annotations. The PTM database is a compendium of public PTM data coupled with user-uploaded experimental data. ProteomeScout provides analysis tools for experimental datasets, including summary views and subset selection; subset selection can identify relationships within subsets of data by testing for statistically significant enrichment of protein annotations. Protein annotations are incorporated into the ProteomeScout database from external resources and include terms such as Gene Ontology annotations, domains, secondary structure and non-synonymous polymorphisms. These annotations are available in the database download, in the analysis tools and in the protein viewer. The protein viewer displays annotations simultaneously in an interactive web graphic, which can be exported in Scalable Vector Graphics (SVG) format. Finally, quantitative measurements associated with public experiments are viewable within protein records, allowing researchers to see how PTMs change across different contexts. ProteomeScout should prove useful for protein researchers and should benefit the proteomics community by providing a stable repository for PTM experiments.
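
To make the idea of annotation enrichment within a subset concrete, the following is a minimal sketch, not ProteomeScout's own code. The abstract does not name the statistical test, so a one-sided hypergeometric test is assumed here; the function name `enrichment_p_value` and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(subset_hits, subset_size, background_hits, background_size):
    """One-sided hypergeometric tail probability P(X >= subset_hits).

    Assumption: enrichment is modeled as drawing `subset_size` proteins
    without replacement from a background of `background_size` proteins,
    of which `background_hits` carry the annotation of interest.
    """
    # The subset cannot contain more annotated proteins than exist overall.
    upper = min(subset_size, background_hits)
    # Number of ways to choose any subset of this size from the background.
    total = comb(background_size, subset_size)
    # Sum the ways to obtain subset_hits or more annotated proteins.
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, subset_size - k)
        for k in range(subset_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 12 of 40 proteins in a selected subset carry an
# annotation that 60 of the 1000 proteins in the full dataset carry.
p = enrichment_p_value(12, 40, 60, 1000)
print(f"enrichment p-value: {p:.3g}")
```

When many annotations are tested against the same subset, the resulting p-values would normally be adjusted for multiple testing before any are called significant.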
