Gene‐specific correlation of RNA and protein levels in human cells and tissues

An important issue for molecular biology is to establish whether transcript levels of a given gene can be used as proxies for the corresponding protein levels. Here, we have developed a targeted proteomics approach for a set of human non‐secreted proteins, based on parallel reaction monitoring, to measure absolute protein copy numbers at steady‐state conditions across human tissues and cell lines, and we compared these levels with the corresponding mRNA levels obtained by transcriptomics. The study shows that transcript and protein levels do not correlate well unless a gene‐specific RNA‐to‐protein (RTP) conversion factor, independent of the tissue type, is introduced; this factor significantly enhances the predictability of protein copy numbers from RNA levels. The results show that the RTP ratio varies significantly between genes, from a few hundred protein copies per mRNA molecule for some genes to several hundred thousand protein copies per mRNA molecule for others. In conclusion, our data suggest that transcriptome analysis can be used as a tool to predict protein copy numbers per cell, thus forming an attractive link between the fields of genomics and proteomics.
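
To make the idea of a gene‐specific RTP conversion factor concrete, the following minimal Python sketch estimates one factor per gene from paired RNA and protein measurements and then uses it to predict a protein copy number from a new RNA level. It is an illustration only, not the method used in the study: the function names, the choice of the median of per‐sample ratios as the estimator, the RNA unit, and all numeric values are assumptions introduced here.

```python
from statistics import median


def estimate_rtp(rna_levels, protein_copies):
    """Estimate a single gene's RTP factor (hypothetical helper).

    Assumption: the factor is taken as the median of the per-sample
    protein/RNA ratios across tissues or cell lines. The abstract does
    not state which estimator the study used.
    """
    ratios = [p / r for r, p in zip(rna_levels, protein_copies) if r > 0]
    if not ratios:
        raise ValueError("need at least one sample with a non-zero RNA level")
    return median(ratios)


def predict_protein_copies(rna_level, rtp):
    """Predict protein copies per cell as RNA level times the gene's RTP factor."""
    return rna_level * rtp


# Made-up measurements for one gene in three samples (illustrative only).
rna = [10.0, 20.0, 40.0]                    # assumed unit: mRNA molecules per cell
protein = [50_000.0, 110_000.0, 190_000.0]  # protein copies per cell

rtp = estimate_rtp(rna, protein)
predicted = predict_protein_copies(30.0, rtp)
print(f"RTP factor: {rtp:.0f} protein copies per mRNA molecule")
print(f"Predicted copies at RNA level 30: {predicted:.0f}")
```

Because the factor is estimated per gene and reused across samples, the sketch reflects the abstract's claim that the factor is gene‐specific and independent of tissue type; the median is chosen here only because it is robust to a single outlying sample.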
