Human Proteome Project Mass Spectrometry Data Interpretation Guidelines 2.1.

Every data-rich community research effort requires a clear plan for ensuring the quality of the data interpretation and comparability of analyses. To address this need within the Human Proteome Project (HPP) of the Human Proteome Organization (HUPO), we have developed through broad consultation a set of mass spectrometry data interpretation guidelines that should be applied to all HPP data contributions. For submission of manuscripts reporting HPP protein identification results, the guidelines are presented as a one-page checklist containing 15 essential points followed by two pages of expanded description of each. Here we present an overview of the guidelines and provide an in-depth description of each of the 15 elements to facilitate understanding of the intentions and rationale behind the guidelines, for both authors and reviewers. Broadly, these guidelines provide specific directions regarding how HPP data are to be submitted to mass spectrometry data repositories, how error analysis should be presented, and how detection of novel proteins should be supported with additional confirmatory evidence. These guidelines, developed by the HPP community, are presented to the broader scientific community for further discussion.

[1]  Lennart Martens,et al.  PRIDE: The proteomics identifications database , 2005, Proteomics.

[2]  Hyungwon Choi,et al.  False discovery rates and related statistical concepts in mass spectrometry-based proteomics. , 2008, Journal of proteome research.

[3]  A. Makarov,et al.  Discrimination of leucine and isoleucine in peptides sequencing with Orbitrap Fusion mass spectrometer. , 2014, Analytical chemistry.

[4]  Susan E. Abbatiello,et al.  Targeted Peptide Measurements in Biology and Medicine: Best Practices for Mass Spectrometry-based Assay Development Using a Fit-for-Purpose Approach* , 2014, Molecular & Cellular Proteomics.

[5]  R. Aebersold,et al.  A statistical model for identifying proteins by tandem mass spectrometry. , 2003, Analytical chemistry.

[6]  G. Damonte,et al.  How to discriminate between leucine and isoleucine by low energy ESI-TRAP MSn , 2007, Journal of the American Society for Mass Spectrometry.

[7]  Ying Zhang,et al.  The neXtProt knowledgebase on human proteins: current status , 2014, Nucleic Acids Res..

[8]  John R Yates,et al.  Mass spectrometry in high-throughput proteomics: ready for the big time , 2010, Nature Methods.

[9]  R. Aebersold,et al.  Mass spectrometry-based proteomics , 2003, Nature.

[10]  Ludovic C. Gillet,et al.  Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis* , 2012, Molecular & Cellular Proteomics.

[11]  U. Eckhard,et al.  Protein Termini and Their Modifications Revealed by Positional Proteomics. , 2015, ACS chemical biology.

[12]  A. Nesvizhskii Proteogenomics: concepts, applications and computational strategies , 2014, Nature Methods.

[13]  Steven P Gygi,et al.  Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry , 2007, Nature Methods.

[14]  William S Hancock,et al.  Recent Advances in the Chromosome-Centric Human Proteome Project: Missing Proteins in the Spot Light. , 2015, Journal of proteome research.

[15]  David D. Shteynberg,et al.  State of the Human Proteome in 2014/2015 As Viewed through PeptideAtlas: Enhancing Accuracy and Coverage through the AtlasProphet. , 2015, Journal of proteome research.

[16]  Martin Eisenacher,et al.  The mzIdentML Data Standard for Mass Spectrometry-Based Proteomics Results , 2012, Molecular & Cellular Proteomics.

[17]  S. Carr,et al.  Reporting Protein Identification Data , 2006, Molecular & Cellular Proteomics.

[18]  Mathias Wilhelm,et al.  A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets , 2015, Molecular & Cellular Proteomics.

[19]  A. Nesvizhskii A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. , 2010, Journal of proteomics.

[20]  Nichole L. King,et al.  The PeptideAtlas Project , 2010, Proteome Bioinformatics.

[21]  R. Aebersold,et al.  Selected reaction monitoring–based proteomics: workflows, potential, pitfalls and future directions , 2012, Nature Methods.

[22]  Martin Eisenacher,et al.  Development of data representation standards by the human proteome organization proteomics standards initiative , 2015, J. Am. Medical Informatics Assoc..

[23]  Nichole L. King,et al.  Integration with the human genome of peptide sequences obtained by high-throughput mass spectrometry , 2004, Genome Biology.

[24]  Andrew R. Jones,et al.  ProteomeXchange provides globally co-ordinated proteomics data submission and dissemination , 2014, Nature Biotechnology.

[25]  Li Ding,et al.  An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer* , 2015, Molecular & Cellular Proteomics.

[26]  Lennart Martens,et al.  The minimum information about a proteomics experiment (MIAPE) , 2007, Nature Biotechnology.

[27]  Henry H. N. Lam,et al.  Data analysis and bioinformatics tools for tandem mass spectrometry in proteomics. , 2008, Physiological genomics.

[28]  Luis Mendoza,et al.  PASSEL: The PeptideAtlas SRMexperiment library , 2012, Proteomics.

[29]  Juan Antonio Vizcaíno,et al.  Quest for Missing Proteins: Update 2015 on Chromosome-Centric Human Proteome Project. , 2015, Journal of proteome research.

[30]  Alexey I Nesvizhskii,et al.  Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. , 2002, Analytical chemistry.

[31]  Steven A. Carr,et al.  On Credibility, Clarity, and Compliance , 2015, Molecular & Cellular Proteomics.

[32]  S. Hanash,et al.  The Chromosome-Centric Human Proteome Project for cataloging proteins encoded in the genome , 2012, Nature Biotechnology.

[33]  Theodoros Goulas,et al.  LysargiNase mirrors trypsin for protein C-terminal and methylation-site identification , 2014, Nature Methods.

[34]  Juan Antonio Vizcaíno,et al.  How to submit MS proteomics data to ProteomeXchange via the PRIDE database , 2014, Proteomics.

[35]  Joshua E. Elias,et al.  Target-Decoy Search Strategy for Mass Spectrometry-Based Proteomics , 2010, Proteome Bioinformatics.

[36]  Cathy H. Wu,et al.  The Human Proteome Project: Current State and Future Direction , 2011, Molecular & Cellular Proteomics.

[37]  Lennart Martens,et al.  Analysis of the resolution limitations of peptide identification algorithms. , 2011, Journal of proteome research.

[38]  A. Nesvizhskii,et al.  Metrics for the Human Proteome Project 2015: Progress on the Human Proteome and Guidelines for High-Confidence Protein Identification. , 2015, Journal of proteome research.