Linear and non-linear dependencies between copy number aberrations and mRNA expression reveal distinct molecular pathways in breast cancer

Background: Elucidating the exact relationship between gene copy number and expression would enable identification of the regulatory mechanisms behind abnormal gene expression and of the biological pathways of regulation. Most current approaches depend either on linear correlation or on nonparametric tests of association that are insensitive to the exact shape of the relationship. Based on knowledge of enzyme kinetics and gene regulation, we would expect the functional shape of the relationship to be gene dependent and related to the gene regulatory mechanisms involved. Here, we propose a statistical approach to investigate and distinguish between linear and nonlinear dependencies between DNA copy number alteration and mRNA expression.

Results: We applied the proposed method to DNA copy numbers derived from Illumina 109 K SNP-CGH arrays (using the log R values) and expression data from Agilent 44 K mRNA arrays, focusing on commonly aberrant genomic loci in a collection of 102 breast tumors. Regression analysis was used to identify the type of relationship (linear or nonlinear), and subsequent pathway analysis revealed that genes displaying a linear relationship were, overall, associated with substantially different biological processes than genes displaying a nonlinear relationship. In the group of genes with a linear relationship, we found significant associations with canonical pathways, including purine and pyrimidine metabolism (for both deletions and amplifications) as well as estrogen metabolism (linear amplification) and BRCA-related response to damage (linear deletion). In the group of genes displaying a nonlinear relationship, the top canonical pathways were specific signalling pathways such as PTEN and PI3K/AKT (nonlinear amplification) and Wnt/β-catenin and IL-2 signalling (nonlinear deletion). Both amplifications and deletions pointed to the same affected pathways and identified cancer as the top significant disease, with cell cycle, cell signaling and cellular development as significant networks.

Conclusions: This paper presents a novel approach to assessing the validity of the dependence of expression data on copy number data, and this approach may help in identifying the drivers of carcinogenesis.
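The abstract does not state which regression models or selection criterion were used to label a gene as linear or nonlinear. As a minimal sketch of the general idea, and not the authors' actual procedure, the example below assumes a quadratic polynomial as the nonlinear alternative and picks between the two fits with the Akaike information criterion (AIC). The function names `aic_polyfit` and `classify_relationship` are hypothetical, and the inputs are assumed to be per-gene vectors of log R copy number and mRNA expression across tumors.

```python
import numpy as np


def aic_polyfit(x, y, degree):
    """AIC of a least-squares polynomial fit, assuming Gaussian errors."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    n = len(y)
    rss = float(np.sum(residuals ** 2))
    # Parameters: (degree + 1) polynomial coefficients plus the error variance.
    k = degree + 2
    return n * np.log(rss / n) + 2 * k


def classify_relationship(copy_number, expression):
    """Label one gene 'linear' or 'nonlinear' by comparing AIC values.

    Assumption: 'nonlinear' is represented by a quadratic fit; the source
    does not specify the nonlinear model family.
    """
    x = np.asarray(copy_number, dtype=float)
    y = np.asarray(expression, dtype=float)
    aic_linear = aic_polyfit(x, y, 1)
    aic_quadratic = aic_polyfit(x, y, 2)
    # Prefer the simpler model unless the quadratic fit has a lower AIC.
    return "linear" if aic_linear <= aic_quadratic else "nonlinear"
```

Applied gene by gene, a rule of this kind yields the two gene groups that would then be passed separately to pathway analysis. The AIC penalty keeps the quadratic model from winning merely because it has one more coefficient.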
