Integrated View of Baseline Protein Expression in Human Tissues

The availability of proteomics datasets in the public domain, and in the PRIDE database in particular, has increased dramatically in recent years. This large-scale availability of data provides an opportunity to analyse datasets jointly and obtain organism-wide protein abundance data in a consistent manner. We have reanalysed 24 public proteomics datasets from healthy human individuals to assess baseline protein abundance in 31 organs. We define a tissue as a distinct functional or structural region within an organ. Overall, the aggregated dataset contains 67 healthy tissues, corresponding to 3,119 mass spectrometry runs covering 498 samples from 489 individuals. We compared protein abundances between organs and studied the distribution of proteins across organs. We also compared the results with data generated in analogous studies, and performed gene ontology and pathway enrichment analyses to identify organ-specific enriched biological processes and pathways. As a key point, we have integrated the protein abundance results into the resource Expression Atlas, where they can be accessed and visualised either individually or together with gene expression data coming from transcriptomics datasets. We believe this is a good mechanism to make proteomics data more accessible to life scientists.
