Proteome-transcriptome alignment of molecular portraits achieved by self-contained gene set analysis: Consensus colon cancer subtypes case study

Gene set analysis (GSA) has become a common methodology for analyzing transcriptomics data. However, self-contained GSA techniques are rarely, if ever, used for proteomics data analysis. Here we present a self-contained proteome-level GSA of the four consensus molecular subtypes (CMSs) previously established by transcriptome dissection of colon carcinoma specimens. Despite notable differences in the structure of proteomics and transcriptomics data, many pathway-wide characteristic features of CMSs found at the mRNA level were reproduced at the protein level. In particular, CMS1 features show heavy involvement of the immune system as well as pathways related to mismatch repair, DNA replication and proteasome function, while CMS4 tumors upregulate the complement pathway and proteins participating in epithelial-to-mesenchymal transition (EMT). In addition, protein-level GSA yielded a set of novel observations visible at the proteome level but not at the transcriptome level, including possible involvement of major histocompatibility complex II (MHC-II) antigens in the known immunogenicity of CMS1 and a connection between cholesterol trafficking and the regulation of integrin-linked kinase (ILK) in CMS3. Overall, this study demonstrates the utility of self-contained GSA approaches as a critical tool for analyzing proteomics data in general and dissecting protein-level molecular portraits of human tumors in particular.
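For readers unfamiliar with the term, a self-contained GSA tests the null hypothesis that no member of a given set differs between two phenotypes, using only the members of that set and a null distribution built by permuting sample labels. The following is a minimal illustrative sketch of that idea, not the specific tests applied in this study. The function name `self_contained_gsa`, the sum-of-squared-t statistic, and the simulated abundance matrices are all assumptions made for illustration.

```python
import numpy as np

def self_contained_gsa(X, labels, n_perm=2000, seed=0):
    """Permutation p-value for one protein set (illustrative sketch).

    X      : (n_samples, n_proteins_in_set) abundance matrix for the set only.
    labels : boolean array, True for samples of the subtype of interest.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels, dtype=bool)

    def statistic(lab):
        # Sum of squared Welch t-statistics over the proteins in the set.
        a, b = X[lab], X[~lab]
        diff = a.mean(axis=0) - b.mean(axis=0)
        se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
        return float(np.sum((diff / se) ** 2))

    observed = statistic(labels)
    # Null distribution: shuffle sample labels, keeping protein correlations intact.
    null = np.array([statistic(rng.permutation(labels)) for _ in range(n_perm)])
    # The +1 correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

# Simulated example: 20 samples per group, 10 proteins in the set.
rng = np.random.default_rng(1)
labels = np.array([True] * 20 + [False] * 20)
X_null = rng.normal(size=(40, 10))   # no difference between groups
X_shift = X_null.copy()
X_shift[labels] += 1.5               # set shifted upward in one group
print("p (no difference):", self_contained_gsa(X_null, labels))
print("p (shifted set):  ", self_contained_gsa(X_shift, labels))
```

Because sample labels rather than proteins are permuted, the test does not depend on proteins outside the set, which is what distinguishes a self-contained test from a competitive one.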
