A genome-wide approach to link genotype to clinical outcome by utilizing next generation sequencing and gene chip data of 6,697 breast cancer patients

Background: Using somatic mutations to predict clinical outcome is difficult because a single mutation can indirectly influence the function of many genes, and because clinical follow-up is sparse in the relatively young next-generation sequencing (NGS) databanks. Here we approach this problem by linking sequence databanks to well-annotated gene chip datasets, using a multigene transcriptomic fingerprint as the link between gene mutations and gene expression in breast cancer patients.

Methods: The NGS database consists of 763 samples with mutational status for 22,938 genes and RNA-seq data for 10,987 genes. The gene chip database contains 5,934 patients with expression data for 10,987 genes plus clinical characteristics. For the prediction, the mutations present in a sample are first translated into a ‘transcriptomic fingerprint’ by running ROC analysis on the mutation and RNA-seq data. Correlation with survival is then assessed by Cox regression, computed separately for the up- and downregulated signatures.

Results: With this approach, the top driver oncogenes with a mutation prevalence over 5% included AKT1, TRANK1, TRAPPC10, RPGR, COL6A2, RAPGEF4, ATG2B, CNTRL, NAA38, OSBPL10, POTEF, SCLT1, SUN1, VWDE, MTUS2, and PIK3CA, and the top tumor suppressor genes included PHEX, TP53, GGA3, RGS22, PXDNL, ARFGEF1, BRCA2, CHD8, GCC2, and ARMC4. The system was validated by computing the correlation between RNA-seq and microarray data (r² = 0.73, P < 1E-16). Cross-validation using 20 genes with a prevalence of approximately 5% confirmed the reproducibility of the analysis.

Conclusions: We established a pipeline that enables rapid clinical validation of a discovered mutation in a large breast cancer cohort. An online interface is available for evaluating any human gene mutation, or combinations of up to three such genes (http://www.g-2-o.com).
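The fingerprint step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the AUC cutoff of 0.7, and the toy gene and sample identifiers are all assumptions made for illustration. For each gene, the ROC AUC measures how well expression separates mutated from wild-type samples; genes with a high AUC form the upregulated signature and genes with a low AUC form the downregulated one. The Cox regression step is left out because it needs a survival analysis library; here the signature is only reduced to a per-patient score that such a model could take as its covariate.

```python
from statistics import mean


def auc(mutant, wildtype):
    """ROC AUC via the Mann-Whitney statistic: the probability that a random
    mutant sample has higher expression than a random wild-type sample
    (ties count as 0.5)."""
    wins = 0.0
    for m in mutant:
        for w in wildtype:
            if m > w:
                wins += 1.0
            elif m == w:
                wins += 0.5
    return wins / (len(mutant) * len(wildtype))


def fingerprint(expression, mutated, auc_cutoff=0.7):
    """Split genes into up- and downregulated signatures.

    expression: dict gene -> dict sample_id -> expression value
    mutated:    set of sample ids carrying the mutation of interest
    auc_cutoff: hypothetical threshold, not taken from the paper
    """
    up, down = [], []
    for gene, values in expression.items():
        mut = [v for s, v in values.items() if s in mutated]
        wt = [v for s, v in values.items() if s not in mutated]
        if not mut or not wt:
            continue  # AUC is undefined without both groups
        a = auc(mut, wt)
        if a >= auc_cutoff:
            up.append(gene)
        elif a <= 1.0 - auc_cutoff:
            down.append(gene)
    return sorted(up), sorted(down)


def signature_score(sample_expression, genes):
    """Mean expression of the signature genes in one patient sample."""
    return mean(sample_expression[g] for g in genes)


# Toy data (hypothetical): four samples, two of them mutated.
expression = {
    "GENE_A": {"s1": 9.0, "s2": 8.5, "s3": 2.0, "s4": 2.5},
    "GENE_B": {"s1": 1.0, "s2": 1.5, "s3": 7.0, "s4": 6.5},
    "GENE_C": {"s1": 5.0, "s2": 4.0, "s3": 5.0, "s4": 4.0},
}
mutated = {"s1", "s2"}

up, down = fingerprint(expression, mutated)
score = signature_score({"GENE_A": 8.0, "GENE_B": 2.0}, up)
print(up, down, score)
```

In this toy set, GENE_A separates the groups perfectly in the upward direction, GENE_B in the downward direction, and GENE_C not at all, so only the first two enter a signature. In a full pipeline the per-patient signature score from the gene chip cohort would be the variable passed to a Cox model together with the follow-up data.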
