OncodriveCLUST: exploiting the positional clustering of somatic mutations to identify cancer genes

MOTIVATION Gain-of-function mutations often cluster in specific protein regions, a signal that those mutations provide an adaptive advantage to cancer cells and are consequently positively selected during the clonal evolution of tumours. We sought to determine how widespread this feature is in cancer and whether it can be used to identify drivers.

RESULTS We have developed OncodriveCLUST, a method to identify genes with a significant bias towards mutation clustering within the protein sequence. The method constructs its background model from coding-silent mutations, which are assumed not to be under positive selection and may therefore reflect the baseline tendency of somatic mutations to cluster. OncodriveCLUST analysis of the Catalogue of Somatic Mutations in Cancer retrieved a list of genes enriched in Cancer Gene Census genes, prioritizing those with dominant phenotypes but also highlighting some recessive cancer genes, which showed wider but still delimited mutation clusters. Assessment of datasets from The Cancer Genome Atlas demonstrated that OncodriveCLUST selected cancer genes that were missed by methods based on frequency and functional impact criteria. This underlines the benefit of combining approaches based on complementary principles to identify driver mutations. We propose OncodriveCLUST as an effective tool for that purpose.

AVAILABILITY OncodriveCLUST has been implemented as a Python script and is freely available from http://bg.upf.edu/oncodriveclust.

CONTACT nuria.lopez@upf.edu or abel.gonzalez@upf.edu

SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
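The idea summarised in RESULTS, scoring how tightly a gene's protein-affecting mutations cluster and comparing that score with a background derived from coding-silent mutations, can be illustrated with a toy sketch. This is not the published OncodriveCLUST algorithm. The scoring function, the 5-residue window, the z-score comparison, and all gene names and positions below are assumptions made purely for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def clustering_score(positions, window=5):
    """Toy clustering score (an assumption, not the published formula).

    Returns the largest fraction of a gene's mutations that lie within
    `window` residues of any single mutated protein position.
    """
    if not positions:
        return 0.0
    counts = Counter(positions)
    total = len(positions)
    best = 0
    for centre in counts:
        # Count every mutation within `window` residues of this position.
        in_window = sum(n for pos, n in counts.items()
                        if abs(pos - centre) <= window)
        best = max(best, in_window)
    return best / total


def z_against_background(gene_score, background_scores):
    """Compare a gene's score with the scores from coding-silent mutations."""
    mu = mean(background_scores)
    sigma = pstdev(background_scores)
    if sigma == 0:
        return 0.0  # background has no spread, so no z-score can be formed
    return (gene_score - mu) / sigma


# Hypothetical protein positions of protein-affecting mutations per gene.
protein_affecting = {
    "GENE_A": [12, 12, 13, 12, 61, 12, 13],   # one tight hotspot
    "GENE_B": [5, 80, 150, 230, 310, 400],    # scattered
}
# Hypothetical coding-silent mutations used to build the background.
coding_silent = {
    "GENE_C": [10, 95, 180, 260],
    "GENE_D": [20, 22, 140, 300],
    "GENE_E": [7, 120, 240, 360, 480],
}

background = [clustering_score(p) for p in coding_silent.values()]
z_scores = {
    gene: z_against_background(clustering_score(pos), background)
    for gene, pos in protein_affecting.items()
}
```

In this sketch the hotspot-like GENE_A scores well above the silent-mutation background while the scattered GENE_B does not, which is the kind of contrast the method is described as exploiting. A real analysis would need far more data and a proper significance test.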
