Combined analysis of gene expression, DNA copy number, and mutation profiling data to display biological process anomalies in individual breast cancers

The goal of this analysis was to develop a computational tool that integrates the totality of gene expression, DNA copy number, and sequence abnormalities in individual cancers within the framework of biological processes. We used the hierarchical structure of the Gene Ontology (GO) database to create a reference network and projected the mRNA expression, DNA copy number, and mutation anomalies detected in single samples into this space. We applied our method to 59 breast cancers for which all three types of molecular data were available. Each cancer had a large number of disturbed biological processes. Locomotion, multicellular organismal process, and signal transduction pathways were the most commonly affected GO terms, but the individual molecular events differed from case to case. Estrogen receptor-positive and -negative cancers had different repertoires of anomalies. Using an siRNA screen, we tested the functional impact of 27 mRNAs that were overexpressed in cancer at variable frequencies (<2% to 42%). Silencing each of these genes inhibited cell growth in at least some of 18 breast cancer cell lines. We developed a free online software tool (http://netgoplot.org) to display the complex genomic abnormalities of individual cancers in the framework of GO biological processes. Each cancer harbored a variable number of pathway anomalies, and the individual molecular events that caused an anomaly varied from case to case. Our in vitro experiments indicate that rare, case-specific molecular abnormalities can play a functional role and that driver events may vary from case to case depending on the constellation of other molecular anomalies.
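The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the term hierarchy below is a toy fragment, not the real GO graph, and the gene names, annotations, and the `project` helper are hypothetical. The sketch only shows the general idea of propagating per-sample gene anomalies up a term hierarchy so that each process records which genes and which data types disturbed it.

```python
from collections import defaultdict

# Toy child -> parents map (assumption: a hand-made fragment, not the real GO graph).
PARENTS = {
    "biological_process": [],
    "locomotion": ["biological_process"],
    "cell migration": ["locomotion"],
    "signaling": ["biological_process"],
    "signal transduction": ["signaling"],
}

# Hypothetical gene -> directly annotated terms.
ANNOTATIONS = {
    "GENE_A": ["cell migration"],
    "GENE_B": ["signal transduction"],
    "GENE_C": ["cell migration", "signal transduction"],
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every term reachable by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def project(sample_anomalies, annotations=ANNOTATIONS, parents=PARENTS):
    """Map {gene: {anomaly types}} for one sample to {term: {anomaly type: {genes}}}."""
    hits = defaultdict(lambda: defaultdict(set))
    for gene, kinds in sample_anomalies.items():
        for term in annotations.get(gene, []):
            for reached in ancestors(term, parents):
                for kind in kinds:
                    hits[reached][kind].add(gene)
    # Convert nested defaultdicts to plain dicts for easy inspection.
    return {term: dict(by_kind) for term, by_kind in hits.items()}


# Two hypothetical samples with different underlying molecular events.
sample_1 = {"GENE_A": {"expression"}, "GENE_B": {"mutation"}}
sample_2 = {"GENE_C": {"copy_number"}}

for name, sample in [("sample_1", sample_1), ("sample_2", sample_2)]:
    print(name, project(sample).get("locomotion"))
```

In this toy example both samples disturb "locomotion", but through different genes and different data types, which mirrors the observation that the same process can be affected by different molecular events from case to case.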
