Using Deep Learning to Extrapolate Protein Expression Measurements

Mass spectrometry (MS)‐based quantitative proteomics experiments typically assay only a subset, up to 60%, of the ≈20 000 human protein-coding genes. Computational methods that impute the missing values from RNA expression data can usually do so only for proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing.
