Recommendations for mass spectrometry data quality metrics for open access data (corollary to the Amsterdam principles)

Policies supporting the rapid and open sharing of proteomic data are being implemented by the leading journals in the field. The proteomics community is taking steps to ensure that data are made publicly accessible and are of high quality, a challenging task that requires the development and deployment of methods for measuring and documenting data quality metrics. On September 18, 2010, the U.S. National Cancer Institute (NCI) convened the “International Workshop on Proteomic Data Quality Metrics” in Sydney, Australia, to identify and address issues facing the development and use of such methods for open access proteomics data. The stakeholders at the workshop enumerated the key principles underlying a framework for data quality assessment in mass spectrometry data that will meet the needs of the research community, journals, funding agencies, and data repositories. Attendees discussed and agreed upon two primary needs for the wide use of quality metrics: (i) an evolving list of comprehensive quality metrics and (ii) standards accompanied by software analytics. Attendees stressed the importance of increased education and training programs to promote reliable protocols in proteomics. This workshop report explores the historic precedents, key discussions, and necessary next steps to enhance the quality of open access data. By agreement, this article is published simultaneously in Proteomics, Proteomics Clinical Applications, Journal of Proteome Research, and Molecular and Cellular Proteomics, as a public service to the research community. The peer review process was a coordinated effort conducted by a panel of referees selected by the journals.
