Data management and data integration in the HUPO plasma proteome project.

The Human Plasma Proteome Project (HPPP) is an international collaboration coordinated by the Human Proteome Organisation (HUPO). Its Pilot Phase generated the 2005 Proteomics special issue "Exploring the Human Plasma Proteome" (Omenn et al. Proteomics 5:3226-3245, 2005) and a book with the same title (Omenn GS (ed) (2006) Exploring the human plasma proteome. Wiley-Liss, Weinheim, pp 372). Data management for the Pilot Phase covered collection, integration, analysis, and dissemination of findings from participating laboratories and data repositories. Many investigators face the same challenges in integrating data from complex, dynamic serum and plasma specimens. The HPPP workflow assembled a representative Core Dataset of 3,020 protein identifications by resolving two problems: ambiguity and redundancy among the heterogeneous contributed identifications, and redundancy and ongoing updates in the protein sequence databases. The results were made available from the University of Michigan with alternative thresholds, yielding a range of protein identification counts. Data were submitted to EBI/PRIDE and to ISB/PeptideAtlas. The current phase of the HPPP employs ProteomeXchange to link submission of well-annotated primary datasets to EBI/PRIDE, distributed file sharing by Tranche/ProteomeCommons.org, and reanalysis from the primary raw spectra at ISB/PeptideAtlas. Such human plasma proteome datasets are available for data mining and for comparison with the proteomes of other organs and biofluids in health and disease.
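The integration step described in the abstract, collapsing redundant identifications and then reporting counts at alternative thresholds, can be illustrated with a minimal Python sketch. This is not the project's actual pipeline: the accessions, peptide strings, the `ACCESSION_UPDATES` map, and the choice of "distinct peptides per protein" as the threshold are all hypothetical and chosen only to show why different thresholds yield different numbers of protein identifications.

```python
from collections import defaultdict

# Hypothetical contributed identifications: (laboratory, accession, peptide).
# None of these values come from the HPPP data.
IDENTIFICATIONS = [
    ("lab_A", "ACC0001", "PEPTIDEA"),
    ("lab_A", "ACC0001", "PEPTIDEB"),
    ("lab_B", "ACC0001_OLD", "PEPTIDEA"),  # retired accession for the same protein
    ("lab_B", "ACC0002", "PEPTIDEC"),
    ("lab_C", "ACC0002", "PEPTIDEC"),      # same protein reported by a second lab
    ("lab_C", "ACC0003", "PEPTIDED"),
]

# Hypothetical map from retired or redundant accessions to a current one,
# standing in for updates in the protein sequence database.
ACCESSION_UPDATES = {"ACC0001_OLD": "ACC0001"}


def integrate(identifications, updates):
    """Group distinct peptides under one canonical accession per protein."""
    peptides_by_protein = defaultdict(set)
    for _lab, accession, peptide in identifications:
        canonical = updates.get(accession, accession)
        peptides_by_protein[canonical].add(peptide)
    return dict(peptides_by_protein)


def count_at_threshold(peptides_by_protein, min_peptides):
    """Count proteins supported by at least `min_peptides` distinct peptides."""
    return sum(
        1 for peptides in peptides_by_protein.values()
        if len(peptides) >= min_peptides
    )


integrated = integrate(IDENTIFICATIONS, ACCESSION_UPDATES)
counts = {t: count_at_threshold(integrated, t) for t in (1, 2)}
print(counts)  # {1: 3, 2: 1}
```

In this toy input, six contributed records collapse to three proteins once the retired accession is remapped, and only one protein survives the stricter two-peptide threshold.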
