The Proteomics Identifications database: 2010 update

The Proteomics Identifications database (PRIDE, http://www.ebi.ac.uk/pride) at the European Bioinformatics Institute has become one of the main repositories of mass spectrometry-derived proteomics data. Over the last 2 years, PRIDE data holdings have grown substantially, and by September 2009 they comprised data from 60 different species, more than 2.5 million protein identifications, 11.5 million peptides and over 50 million spectra. Here we describe several new and improved features in PRIDE, including the revised submission process, which now supports direct submission of fragment ion annotations. Correspondingly, it is now possible to visualize fragment ion annotations on tandem mass spectra, a key feature for compliance with journal data submission requirements. We also describe recent developments in the PRIDE BioMart interface, which now allows integrative queries that join PRIDE data to a growing number of biological resources such as Reactome, Ensembl, InterPro and UniProt. This ability to perform powerful cross-domain queries is likely to become a cornerstone of future bioinformatics analyses. Finally, we highlight the importance of data sharing in the proteomics field, and the corresponding integration of PRIDE with other databases in the ProteomExchange consortium.
