mCOPA: analysis of heterogeneous features in cancer expression data

Background

Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset.

Results

We compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more-informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours.

Conclusions

We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis. mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
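
The abstract does not give the details of the mCOPA algorithm, so the following is only a minimal sketch of a generic COPA-style transform extended to both tails, not the authors' implementation. It assumes the commonly described COPA steps (median-centre each gene, scale by the median absolute deviation, then read off a percentile of the scaled values). The function name `copa_outlier_scores` and the default percentiles (95th and 5th) are illustrative assumptions.

```python
import numpy as np


def copa_outlier_scores(expr, upper_pct=95.0, lower_pct=5.0):
    """Score each gene for over- and under-expressed outliers.

    expr: 2-D array, rows are genes and columns are samples.
    Returns (up, down): the upper and lower percentiles of the
    median/MAD-scaled expression values for each gene.
    Percentile defaults are assumptions, not values from the paper.
    """
    expr = np.asarray(expr, dtype=float)

    # Median-centre each gene across samples.
    med = np.median(expr, axis=1, keepdims=True)
    centred = expr - med

    # Scale by the median absolute deviation (MAD) of each gene.
    mad = np.median(np.abs(centred), axis=1, keepdims=True)
    flat = (mad == 0).ravel()           # genes with no spread
    safe_mad = np.where(mad > 0, mad, 1.0)
    scaled = centred / safe_mad

    # A high upper percentile suggests over-expressed outliers;
    # a very negative lower percentile suggests under-expressed ones.
    up = np.percentile(scaled, upper_pct, axis=1)
    down = np.percentile(scaled, lower_pct, axis=1)

    # Genes with zero MAD cannot be scaled meaningfully.
    up[flat] = np.nan
    down[flat] = np.nan
    return up, down
```

Genes would then be ranked by `up` (descending) for candidate over-expressed outliers and by `down` (ascending) for candidate under-expressed outliers; how mCOPA itself thresholds or ranks features is not stated in the abstract.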
