Sparse expression bases in cancer reveal tumor drivers

We define a new category of candidate tumor drivers in cancer genome evolution: ‘selected expression regulators’ (SERs), genes that drive dysregulated transcriptional programs as a cancer evolves. SERs are identified from genome-wide tumor expression data with a novel sparse regression method, SPARROW (SPARse selected expRessiOn regulators identified With penalized regression), which uncovers a previously unknown connection between cancer expression variation and driver events. Our results indicate that SPARROW is a powerful complementary approach for identifying candidate genes containing driver events that are hard to detect from sequence data alone, because passenger mutations are numerous and comprehensive sequence information is not available for a sufficiently large number of samples. SERs identified by SPARROW recover known driver mutations in multiple human cancers, along with known cancer-associated processes and survival-associated genes, better than popular methods for inferring gene expression networks. Applied to acute myeloid leukemia (AML) expression data, SPARROW identifies an apoptotic biomarker, PYCARD, for the investigational drug obatoclax. The association between PYCARD and obatoclax is validated in 30 AML patient samples.
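
To make the idea of a sparse expression basis concrete, the sketch below fits one target gene's expression as a penalized linear combination of candidate regulators and keeps only the regulators with nonzero coefficients. It is an illustration under stated assumptions, not SPARROW itself: the abstract does not specify the penalty or model, so a plain lasso solved by coordinate descent stands in for it, and the data, the function names (`soft_threshold`, `lasso_cd`), and the penalty value `lam` are all hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of n rows (samples), each with p candidate-regulator values;
    y holds the target gene's expression in the same n samples.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X @ beta (beta starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical simulated data: 60 tumor samples, 20 candidate regulators,
# with the target gene driven only by regulators 2 and 7.
rng = random.Random(0)
n_samples, n_regulators = 60, 20
X = [[rng.gauss(0, 1) for _ in range(n_regulators)] for _ in range(n_samples)]
y = [2.0 * row[2] - 1.5 * row[7] + 0.1 * rng.gauss(0, 1) for row in X]

beta = lasso_cd(X, y, lam=0.5)  # lam is an arbitrary illustrative choice
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("regulators with nonzero coefficients:", selected)
```

In a genome-wide setting this regression would be repeated for every target gene, and regulators chosen for many targets would be the natural candidates to examine further. How SPARROW actually scores and ranks SERs is not described in this abstract.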
