Lessons from a decade of integrating cancer copy number alterations with gene expression profiles

Over the last decade, multiple functional genomic datasets studying chromosomal aberrations and their downstream effects on gene expression have accumulated for several cancer types. The vast majority consist of paired gene expression profiles and somatic copy number alteration (CNA) data measured on the same patients using microarray platforms. In response, many algorithms and software packages have been developed for integrating these paired data. Surprisingly, there has been no serious attempt to review the currently available methodologies or the novel insights they have produced. In this work, we discuss the quantitative relationships observed between CNA and gene expression in multiple cancer types and the biological milestones achieved using the available methodologies. We discuss the conceptual evolution of both the step-wise and the joint data integration methodologies over the last decade. We conclude by providing suggestions for building efficient data integration methodologies and for asking further biological questions.
