Analysis of 6,515 exomes reveals a recent origin of most human protein-coding variants

Establishing the age of each mutation segregating in contemporary human populations is important for fully understanding our evolutionary history 1,2 and will help facilitate the development of new approaches for disease gene discovery 3. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth 4-6, notable for an excess of rare genetic variants, qualitatively suggesting that many mutations arose recently. To assess the distribution of mutation ages more quantitatively, we resequenced 15,336 genes in 6,515 individuals of European American (n=4,298) and African American (n=2,217) ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that ~73% of all protein-coding SNVs and ~86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease
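As a rough illustration of what "inferring the age" of a variant from its frequency can mean, the sketch below evaluates the classical Kimura-Ohta expectation for a neutral allele in a constant-size population, E[age] = -4 N_e x ln(x) / (1 - x) generations. This is not the inference procedure used in the study, and the abstract itself notes recent explosive growth, which violates the constant-size assumption. The function name, the effective population size, and the 25-year generation time are illustrative assumptions.

```python
import math


def expected_neutral_allele_age(freq, n_e, generation_years=25.0):
    """Expected age, in years, of a neutral allele at population frequency `freq`.

    Uses the Kimura-Ohta (1973) diffusion result for a constant-size
    population: E[age] = -4 * N_e * x * ln(x) / (1 - x) generations.
    `n_e` (effective population size) and `generation_years` are
    assumptions supplied by the caller, not values from the paper.
    """
    if not 0.0 < freq < 1.0:
        raise ValueError("freq must lie strictly between 0 and 1")
    generations = -4.0 * n_e * freq * math.log(freq) / (1.0 - freq)
    return generations * generation_years


if __name__ == "__main__":
    # Rare alleles are expected to be much younger than common ones.
    for x in (0.0001, 0.001, 0.01, 0.1, 0.5):
        age = expected_neutral_allele_age(x, n_e=10_000)
        print(f"frequency {x:<7} expected age ~ {age:,.0f} years")
```

Under this simplified model the expected age rises monotonically with allele frequency, which matches the qualitative link the abstract draws between an excess of rare variants and recent mutation origin. Selection and population growth both shift these ages, so the numbers should be read as a reference point only.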

[1] Jacob A. Tennessen, et al. Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes, 2012, Science.

[2] Claudio J. Verzilli, et al. An Abundance of Rare Functional Variants in 202 Drug Target Genes Sequenced in 14,002 People, 2012, Science.

[3] A. Clark, et al. Recent Explosive Human Population Growth Has Resulted in an Excess of Rare Genetic Variants, 2012, Science.

[4] K. Kwack, et al. LAMC1 gene is associated with premature ovarian failure, 2012, Maturitas.

[5] G. Gibson. Rare and common variants: twenty arguments, 2012, Nature Reviews Genetics.

[6] Gabor T. Marth, et al. Demographic history and rare allele sharing among human populations, 2011, Proceedings of the National Academy of Sciences.

[7] Serafim Batzoglou, et al. Identifying a High Fraction of the Human Genome to be under Selective Constraint Using GERP++, 2010, PLoS Comput. Biol.

[8] Taylor J. Maxwell, et al. Deep resequencing reveals excess rare recent variants consistent with explosive population growth, 2010, Nature Communications.

[9] Jana Marie Schwarz, et al. MutationTaster evaluates disease-causing potential of sequence alterations, 2010, Nature Methods.

[10] P. Bork, et al. A method and server for predicting damaging missense mutations, 2010, Nature Methods.

[11] Ryan D. Hernandez, et al. Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data, 2009, PLoS Genetics.

[12] K. Cao, et al. Association of the mutation for the human carboxypeptidase E gene exon 4 with the severity of coronary artery atherosclerosis, 2009, Molecular Biology Reports.

[13] Kosuke M. Teshima, et al. Natural Selection on Genes that Underlie Human Disease Susceptibility, 2008, Current Biology.

[14] Ryan D. Hernandez, et al. Proportionally more deleterious genetic variation in European than in African populations, 2008, Nature.

[15] Robert K. Moyzis, et al. Recent acceleration of human adaptive evolution, 2007, Proceedings of the National Academy of Sciences.

[16] L. Muglia, et al. Amyloid Precursor Protein Regulates Brain Apolipoprotein E and Cholesterol Metabolism through Lipoprotein Receptor LRP1, 2007, Neuron.

[17] Ben-Yang Liao, et al. Impacts of gene essentiality, expression pattern, and gene compactness on the evolutionary rate of mammalian proteins, 2006, Molecular Biology and Evolution.

[18] S. Gabriel, et al. Calibrating a coalescent simulation of human genome sequence variation, 2005, Genome Research.

[19] N. Campbell. Genetic association database, 2004, Nature Reviews Genetics.

[20] Sarah A. Tishkoff, et al. Patterns of human genetic diversity: implications for human evolutionary history and disease, 2003, Annual Review of Genomics and Human Genetics.

[21] M. Slatkin, et al. Estimating allele age, 2003, Annual Review of Genomics and Human Genetics.

[22] T. Ohta, et al. The age of a neutral mutant persisting in a finite population, 1973, Genetics.

[23] Sung-Oh Chun, et al. Identification of deleterious mutations within three human genomes, 2015.

[24] K. Pollard, et al. Detection of nonneutral substitution rates on mammalian phylogenies, 2010, Genome Research.

[25] S. Henikoff, et al. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm, 2009, Nature Protocols.

[26] M. Zatz, et al. Mutations in the KIAA0196 gene at the SPG8 locus cause hereditary spastic paraplegia, 2007, American Journal of Human Genetics.

[27] S. Tavaré, et al. The age of a mutation in a general coalescent tree, 1998.