Detailed computational study of p53 and p16: using evolutionary sequence analysis and disease-associated mutations to predict the functional consequences of allelic variants

Deciding whether a missense allelic variant affects protein function is important in many contexts. We previously demonstrated that a detailed analysis of p53 intragenic conservation correlates with somatic mutation hotspots. Here we refine these evolutionary studies and expand them to the p16/Ink4a gene. We calculated that in order for ‘absolute conservation’ of a codon across multiple species to achieve P<0.05, the evolutionary substitution database must contain at least 3(M) variants, where M equals the number of codons in the gene. Codons in p53 were divided into high (73% of codons), intermediate (29% of codons), and low (0 codons) likelihood of being mutation hotspots. From a database of 263 somatic missense p16 mutations, we identified only four codons that are mutational hotspots at P<0.05 (8 mutations). However, data on function, structure, and disease association support the conclusion that 11 other codons with ≥5 somatic mutations also likely indicate functionally critical residues, even though P>0.05. We calculated p16 evolution using amino acid substitution matrices and nucleotide substitution distances. We looked for evolutionary parameters at each codon that would predict whether missense mutations were disease associated or disrupted function. The current p16 evolutionary substitution database is too small to determine whether observations of ‘absolute conservation’ are statistically significant. Increasing the number of sequences from three to seven significantly improved the predictive value of evolutionary computations. The sensitivity and specificity of conservation scores in predicting disease association of p16 codons are 70–80%. Despite the small p16 sequence database, our calculations of high conservation correctly predicted loss of cell cycle arrest function in 75% of tested codons, and low conservation correctly predicted wild-type function in 80–90% of codons. These data validate our hypothesis that detailed evolutionary analyses help predict the consequences of missense amino-acid variants.
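
The 3(M) threshold can be illustrated with a simple null model. The abstract does not state how the figure was derived, so the following is only a sketch under an assumed model: N evolutionary substitutions fall independently and uniformly at random on M codons, so a given codon escapes all of them with probability (1 − 1/M)^N ≈ e^(−N/M). Requiring this to be below 0.05 gives N/M > ln 20 ≈ 3. The function name `min_substitutions` is hypothetical, and the codon counts used (393 for human p53, 156 for human p16INK4a) are the known protein lengths rather than values taken from the abstract.

```python
import math


def min_substitutions(num_codons: int, alpha: float = 0.05) -> int:
    """Smallest N with (1 - 1/M)**N < alpha under the ASSUMED uniform null model.

    num_codons is M, the number of codons in the gene. The model (substitutions
    landing independently and uniformly on codons) is an illustrative
    assumption, not a method confirmed by the source.
    """
    if num_codons < 2:
        raise ValueError("need at least two codons")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Probability that one substitution misses a given codon.
    miss = 1.0 - 1.0 / num_codons
    # Solve miss**N < alpha for N, then guard against floating-point ties.
    n = math.ceil(math.log(alpha) / math.log(miss))
    while miss ** n >= alpha:
        n += 1
    return n


if __name__ == "__main__":
    # Codon counts are the human protein lengths (assumed for illustration).
    for gene, m in (("p53", 393), ("p16", 156)):
        n = min_substitutions(m)
        print(f"{gene}: M={m}, minimum N={n}, N/M={n / m:.2f}")
```

Under this assumed model the ratio N/M comes out just under 3 for both genes, consistent with the "at least 3(M) variants" rule of thumb, and it shows why a small substitution database cannot make an observation of absolute conservation statistically significant.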
