Progress on the HUPO Draft Human Proteome: 2017 Metrics of the Human Proteome Project.

The Human Proteome Organization (HUPO) Human Proteome Project (HPP) continues to make progress on its two overall goals: (1) completing the protein parts list, with an annual update of the HUPO draft human proteome, and (2) making proteomics an integrated complement to genomics and transcriptomics throughout biomedical and life sciences research. neXtProt version 2017-01-23 has 17 008 confident protein identifications (Protein Existence [PE] level 1) that are compliant with the HPP Guidelines v2.1 (https://hupo.org/Guidelines), up from 13 664 in 2012-12 and 16 518 in 2016-04. Remaining to be found by mass spectrometry and other methods are 2579 "missing proteins" (PE2+3+4), down from 2949 in 2016. PeptideAtlas 2017-01 has 15 173 canonical proteins, accounting for nearly all of the 15 290 PE1 proteins based on MS data. These resources have extensive data on PTMs, single amino acid variants, and splice isoforms. The Human Protein Atlas v16 has 10 492 highly curated protein entries with tissue and subcellular spatial localization of proteins and transcript expression. Organ-specific popular protein lists have been generated for broad use in quantitative targeted proteomics studies of biology and disease using SRM-MS or DIA-SWATH-MS.
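
The counts quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the HPP metrics pipeline: the variable names are illustrative, and it assumes that the 2016 PE1 count (16 518) and the 2016 missing-protein count (2949) refer to the same neXtProt release.

```python
# Counts copied from the abstract; all variable names are hypothetical.
pe1 = {"2012-12": 13664, "2016-04": 16518, "2017-01": 17008}
missing = {"2016": 2949, "2017": 2579}  # PE2+3+4 "missing proteins"

# Year-over-year change in confident identifications and in missing proteins.
pe1_gain = pe1["2017-01"] - pe1["2016-04"]
missing_drop = missing["2016"] - missing["2017"]

# Total PE1-4 entries in each release, and the PE1 share in 2017.
total_2016 = pe1["2016-04"] + missing["2016"]
total_2017 = pe1["2017-01"] + missing["2017"]
pe1_share = pe1["2017-01"] / total_2017

# Share of MS-based PE1 proteins that are canonical in PeptideAtlas 2017-01.
peptideatlas_share = 15173 / 15290

print(f"PE1 gain since 2016-04: {pe1_gain}")
print(f"Missing proteins removed since 2016: {missing_drop}")
print(f"PE1-4 totals: {total_2016} (2016), {total_2017} (2017)")
print(f"PE1 share of PE1-4 in 2017: {pe1_share:.1%}")
print(f"PeptideAtlas canonical / MS-based PE1: {peptideatlas_share:.1%}")
```

Under that assumption, the PE1 gain and the drop in missing proteins need not be equal, because the PE1-4 total itself differs between the two releases.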
