2016 update of the PRIDE database and its related tools

The PRoteomics IDEntifications (PRIDE) database is one of the world-leading data repositories of mass spectrometry (MS)-based proteomics data. Since the beginning of 2014, PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) has been the new PRIDE archival system, replacing the original PRIDE database. Here we summarize the developments in PRIDE resources and related tools since the previous update manuscript, published in the Database Issue in 2013. PRIDE Archive constitutes a complete redevelopment of the original PRIDE, comprising a new storage backend, data submission system and web interface, among other components. PRIDE Archive supports the most widely used Proteomics Standards Initiative (PSI) data standard formats (mzML and mzIdentML) and implements the data requirements and guidelines of the ProteomeXchange Consortium. The wide adoption of ProteomeXchange within the community has triggered an unprecedented increase in the number of submitted data sets (around 150 data sets per month). We outline some statistics on the current PRIDE Archive data contents. We also report on the status of the PRIDE-related stand-alone tools: PRIDE Inspector, PRIDE Converter 2 and the ProteomeXchange submission tool. Finally, we give a brief update on two resources under development, 'PRIDE Cluster' and 'PRIDE Proteomes', which provide a complementary view of, and quality-scored information on, the peptide and protein identification data available in PRIDE Archive.
