HIT'nDRIVE: Multi-driver Gene Prioritization Based on Hitting Time

A key challenge in cancer genomics is the identification and prioritization of genomic aberrations that potentially act as drivers of cancer. HIT’nDRIVE is a combinatorial method that identifies aberrant genes which can collectively influence possibly distant “outlier” genes. It is based on the “random-walk facility location” (RWFL) problem on an interaction network. RWFL uses the “multi-hitting time”: the expected minimum length of a random walk that starts at any of the aberrant genes and reaches a given outlier. HIT’nDRIVE aims to find the smallest set of aberrant genes from which the outliers can be reached within a desired multi-hitting time. It estimates the multi-hitting time from the independent pairwise hitting times and reduces RWFL to a weighted multi-set cover problem, which it solves as an integer linear program (ILP). We apply HIT’nDRIVE to identify aberrant genes that potentially act as drivers in a cancer data set, and we show that phenotype predictions made using only these potential drivers are more accurate than those of alternative approaches.

Keywords: drivers, cancer, multi-hitting time, interaction networks, multi-set cover
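To make the two ingredients above concrete, the following is a minimal sketch, not the HIT’nDRIVE implementation. It assumes a small, connected, undirected network given as an adjacency matrix, computes pairwise hitting times of a simple random walk by solving a linear system, and then searches by brute force for the smallest set of aberrant genes such that every outlier is within a chosen hitting-time threshold of at least one selected gene. The function names `hitting_times_to` and `smallest_cover`, the threshold `max_time`, and the "covered by at least one gene" criterion are illustrative assumptions. The method described above instead estimates multi-hitting time and solves a weighted multi-set cover as an ILP.

```python
from itertools import combinations

import numpy as np


def hitting_times_to(adj, target):
    """Expected number of steps for a simple random walk to first reach
    `target`, starting from every node. Assumes a connected graph with
    no isolated nodes (illustrative helper, not from the paper)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Row-normalize the adjacency matrix into transition probabilities.
    P = adj / adj.sum(axis=1, keepdims=True)
    keep = [i for i in range(n) if i != target]
    # Restrict the walk to non-target nodes and solve (I - Q) h = 1.
    Q = P[np.ix_(keep, keep)]
    h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    out = np.zeros(n)  # hitting time from the target to itself is 0
    out[keep] = h
    return out


def smallest_cover(adj, aberrant, outliers, max_time):
    """Smallest subset of `aberrant` such that every outlier is reachable
    from at least one chosen gene within `max_time` expected steps.
    Brute force over subsets, so only suitable for toy inputs; this is a
    simplified stand-in for the weighted multi-set cover ILP."""
    H = {o: hitting_times_to(adj, o) for o in outliers}
    for k in range(1, len(aberrant) + 1):
        for subset in combinations(aberrant, k):
            if all(any(H[o][a] <= max_time for a in subset) for o in outliers):
                return subset
    return None  # no subset satisfies the threshold


if __name__ == "__main__":
    # Toy path network 0 - 1 - 2 - 3 - 4 (hypothetical data).
    path5 = np.zeros((5, 5))
    for i in range(4):
        path5[i, i + 1] = path5[i + 1, i] = 1
    print(smallest_cover(path5, aberrant=[0, 4], outliers=[1, 3], max_time=2))
```

On this toy path, each end node reaches its neighbouring outlier in one step but needs nine expected steps to reach the far one, so a tight threshold forces both aberrant genes into the solution while a loose threshold allows a single gene to cover both outliers.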
