Human Proteome Project Mass Spectrometry Data Interpretation Guidelines 3.0

The Human Proteome Organization’s (HUPO) Human Proteome Project (HPP) developed Mass Spectrometry (MS) Data Interpretation Guidelines that have been applied since 2016. These guidelines have helped ensure that the emerging draft of the complete human proteome is highly accurate, with low numbers of false-positive protein identifications. Here, we describe an update to these guidelines based on discussions with the wider HPP community over the past year. The revised Guidelines 3.0 address several major and minor gaps that were identified. The main checklist has been reorganized under headings and subitems, and related guidelines have been grouped. We have added guidelines for emerging data-independent acquisition (DIA) MS workflows and for use of the new Universal Spectrum Identifier (USI) system being developed by the HUPO Proteomics Standards Initiative (PSI). In addition, we discuss updates to the standard HPP pipeline for collecting MS evidence for all proteins in the HPP, including refinements to minimum evidence. We present a new plan for incorporating MassIVE-KB into the HPP pipeline for the next (HPP 2020) cycle in order to obtain more comprehensive coverage of public MS data sets. In sum, Version 2.1 of the HPP MS Data Interpretation Guidelines has served well, and this timely update, version 3.0, will aid the HPP as it approaches its goal of collecting and curating MS evidence of translation and expression for all ∼20,000 predicted human proteins encoded by the human genome.
