A sparse regulatory network of copy-number driven expression reveals putative breast cancer oncogenes

Copy-number aberrations are important in cancer because they may localize to regions harboring oncogenes or tumor suppressors. Such genomic alterations mediate phenotypic changes through their impact on gene expression. Both cis- and trans-acting alterations are of interest, since they may help to elucidate putative cancer genes. However, trans-effects, which may influence multiple genes on a global scale, are less well studied because weak and sparse signals are computationally difficult to detect amidst numerous passenger genes. We propose an integrative approach that learns a sparse interaction network linking DNA copy-number regions to their downstream transcriptional targets in breast cancer. In terms of goodness of fit on both simulated and real data, sparse network inference performs no worse than other state-of-the-art models, with the added advantages of simultaneous feature selection and computational efficiency. The DNA-RNA interaction network helps to distinguish copy-number driven expression alterations from those that are copy-number independent. Further, our approach yields a quantitative copy-number dependency score, which distinguishes cis- from trans-effects. When applied to a breast cancer data set, the approach found numerous expression profiles impacted by cis-acting copy-number alterations, including several known oncogenes such as GRB7, ERBB2, and LSM1. Several trans-acting alterations were also identified, impacting genes such as ADAM2 and BAGE, which warrant further investigation. Availability: An R package named lol is available from www.markowetzlab.org/software/lol.html.
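
The idea of a sparse copy-number-to-expression network can be illustrated with a small sketch. It is not the lol package and not necessarily the authors' exact procedure: it assumes a plain lasso fitted by coordinate descent, one regression per gene, on simulated data. The gene names, the `home_region` mapping, the penalty value `lam=0.25`, and all effect sizes are hypothetical choices made for illustration only.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_cd(columns, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by coordinate descent.

    columns: list of p centred feature columns, each of length n.
    """
    n, p = len(y), len(columns)
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, kept up to date
    col_sq = [sum(v * v for v in col) / n for col in columns]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = sum(c * r for c, r in zip(col, resid)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta


# Simulated data (hypothetical): 100 samples, 8 copy-number regions.
rng = random.Random(0)
n_samples, n_regions = 100, 8
copy_number = [center([rng.gauss(0, 1) for _ in range(n_samples)])
               for _ in range(n_regions)]


def noise():
    return [rng.gauss(0, 0.3) for _ in range(n_samples)]


expression = {
    # Driven by its own region (cis).
    "geneA": center([1.5 * c + e for c, e in zip(copy_number[0], noise())]),
    # Driven by a distant region (trans).
    "geneB": center([1.0 * c + e for c, e in zip(copy_number[5], noise())]),
    # Copy-number independent.
    "geneC": center(noise()),
}
# Hypothetical mapping from each gene to the region that contains it.
home_region = {"geneA": 0, "geneB": 2, "geneC": 3}

# One lasso fit per gene; nonzero coefficients become network edges.
edges = []
for gene, y in expression.items():
    beta = lasso_cd(copy_number, y, lam=0.25)
    for region, coef in enumerate(beta):
        if coef != 0.0:
            kind = "cis" if region == home_region[gene] else "trans"
            edges.append((gene, region, kind))

print(edges)
```

The toy data is designed so that the penalty removes all but the driving region for each gene: an edge is labelled cis when the selected region contains the gene and trans otherwise, and a gene with no selected region is treated as copy-number independent. The quantitative dependency score mentioned above is not reproduced here, since its definition is not given in this text.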
