Statistical Analysis of Pathogenicity of Somatic Mutations in Cancer

Recent large-scale sequencing studies have revealed that cancer genomes contain variable numbers of somatic point mutations distributed across many genes. These somatic mutations most likely include both passenger mutations, which are not cancer causing, and pathogenic driver mutations in cancer genes. Establishing a significant presence of driver mutations in such data sets is of biological interest. Current techniques from phylogeny are applicable to large data sets composed of singly mutated samples, as recently exemplified with a p53 mutation database, but methods for smaller data sets containing individual samples with multiple mutations still need to be developed. Here, exact statistical tests for this problem are devised by constructing distinct models of the mutation process and of the selection pressure acting on the cancer samples. Tests of the significance of selection toward missense, nonsense, and splice site mutations are derived, along with tests assessing variation in selection between functional domains. Maximum-likelihood methods facilitate parameter estimation, including levels of selection pressure and minimum numbers of pathogenic mutations. These methods are illustrated with 25 breast cancers screened across the coding sequences of 518 kinase genes, revealing 90 base substitutions in 71 genes. Significant selection pressure upon truncating mutations was established. Furthermore, an estimated minimum of 29.8 mutations were pathogenic.
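
To make the idea of an exact test for selection concrete, the following Python sketch compares an observed count of one mutation class (for example, nonsense) against the fraction expected under a neutral mutation model, using an exact binomial tail. It is a deliberately simplified illustration and not the tests derived in the paper: the function names, the single-class binomial model, and the odds-ratio form of the selection estimate are all assumptions. Only the total of 90 substitutions comes from the abstract; the observed count and the neutral probability in the usage lines are hypothetical placeholders.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p).

    Used here as a one-sided test: is the observed count k of a mutation
    class larger than a neutral model (per-mutation probability p) predicts?
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def selection_estimate(k: int, n: int, p: float) -> float:
    """Maximum-likelihood selection weight s under a toy model (assumption).

    Toy model: mutations of the class of interest are retained with relative
    weight s and all others with weight 1, so an observed mutation belongs to
    the class with probability s*p / (s*p + 1 - p). Setting this equal to the
    observed fraction k/n and solving for s gives the estimate below.
    """
    if k == n:
        return float("inf")  # every observed mutation is in the class
    return k * (1 - p) / (p * (n - k))


# Usage with placeholder inputs (hypothetical, not taken from the paper):
n_total = 90        # total base substitutions reported in the abstract
k_observed = 10     # hypothetical count of nonsense mutations
p_neutral = 0.05    # hypothetical neutral probability of a nonsense change

p_value = binomial_upper_tail(k_observed, n_total, p_neutral)
s_hat = selection_estimate(k_observed, n_total, p_neutral)
print(f"one-sided p-value: {p_value:.4g}, estimated selection weight: {s_hat:.3g}")
```

A selection weight near 1 corresponds to no detectable selection in this toy model, while values above 1 indicate enrichment of the class. A realistic analysis would derive the neutral probability from the mutation spectrum and the codon composition of the screened genes rather than fixing it by hand.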
