Integration of Proteomics and Transcriptomics Data Sets for the Analysis of a Lymphoma B-Cell Line in the Context of the Chromosome-Centric Human Proteome Project.

A comprehensive study of the molecularly active landscape of human cells can integrate two different but complementary perspectives: transcriptomics and proteomics. In the post-genome era, proteomics has emerged as a powerful tool to simultaneously identify and characterize the thousands of different proteins active in a cell. Accordingly, the Chromosome-centric Human Proteome Project (C-HPP) is promoting a full characterization of the human proteome by combining high-throughput proteomics with data derived from genome-wide expression profiling of protein-coding genes. Here we present a full proteomic profiling of a human lymphoma B-cell line (Ramos), performed on a nanoUPLC-LTQ-Orbitrap Velos proteomic platform and combined with an in-depth transcriptomic profiling of the same cell type. Data are available via ProteomeXchange with identifier PXD001933. Integration of the proteomic and transcriptomic data sets revealed a 94% overlap in the proteins identified by both -omics approaches. Moreover, functional enrichment analysis of the proteomic profiles showed an enrichment of several functions directly related to the biological and morphological characteristics of B-cells. In turn, about 30% of all protein-coding genes in the human genome were identified as being expressed by the Ramos cells (a stable average of about 30% of genes across all chromosomes), revealing the size of the protein expression set present in one specific human cell type. We also report the identification of missing proteins in our data sets, highlighting the power of the approach, and compare neXtProt and UniProt database searches. In summary, our transcriptomic and proteomic experimental profiling provided a high-coverage report of the expressed proteome of a human lymphoma B-cell type, with clear insight into the biological processes that characterize these cells. In this way, we demonstrate the usefulness of combining -omics approaches for a comprehensive characterization of specific biological systems.
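
The two summary figures above (the overlap between the proteomic and transcriptomic identifications, and the per-chromosome fraction of expressed protein-coding genes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the gene identifiers, and the choice of denominator (proteomics-identified genes) are assumptions made for illustration, and the toy data below are invented.

```python
from collections import defaultdict


def overlap_fraction(protein_genes, transcript_genes):
    """Fraction of proteomics-identified genes also detected by transcriptomics.

    Assumption: the overlap is expressed relative to the proteomics set.
    """
    protein_genes = set(protein_genes)
    if not protein_genes:
        return 0.0
    return len(protein_genes & set(transcript_genes)) / len(protein_genes)


def coverage_by_chromosome(expressed_genes, gene_to_chromosome):
    """Per chromosome, fraction of annotated protein-coding genes that are expressed."""
    expressed = set(expressed_genes)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for gene, chrom in gene_to_chromosome.items():
        totals[chrom] += 1
        if gene in expressed:
            hits[chrom] += 1
    return {chrom: hits[chrom] / totals[chrom] for chrom in totals}


# Toy, made-up identifiers (hypothetical; not taken from the study).
annotation = {"G1": "1", "G2": "1", "G3": "1", "G4": "2", "G5": "2"}
proteomics = {"G1", "G4"}
transcriptomics = {"G1", "G2", "G4"}

print(overlap_fraction(proteomics, transcriptomics))
print(coverage_by_chromosome(transcriptomics, annotation))
```

Both sets must share one identifier namespace (for example, gene-level identifiers) before they are compared, so protein accessions and probe or transcript identifiers would need to be mapped to a common key first.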
