De novo mutations across 1,465 diverse genomes reveal novel mutational insights and reductions in the Amish founder population

De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Using high-coverage whole-genome sequencing data from the Trans-Omics for Precision Medicine (TOPMed) program, we directly estimate and analyze DNM counts, rates, and spectra from 1,465 trios across an array of diverse human populations. With the resulting call set of 86,865 single-nucleotide DNMs, we find a significant positive correlation between local recombination rate and local DNM rate; together, these two rates can explain up to 35.5% of the genome-wide variation in population-level rare genetic variation from 41K unrelated TOPMed samples. While genome-wide heterozygosity does correlate weakly with DNM count, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. Interestingly, however, we do find significantly fewer DNMs in Amish individuals compared with other Europeans, even after accounting for parental age and sequencing center. Specifically, we find significant reductions in the number of T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculate near-zero estimates of narrow-sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by non-additive genetic effects and/or the environment, and that a less mutagenic environment may be responsible for the reduced DNM rate in the Amish.

Significance

Here we provide one of the largest and most diverse human de novo mutation (DNM) call sets to date, and use it to quantify the genome-wide relationship between local mutation rate and population-level rare genetic variation. While we demonstrate that the human single-nucleotide mutation rate is similar across numerous human ancestries and populations, we also discover a reduced mutation rate in the Amish founder population, which shows that mutation rates can shift rapidly. Finally, we find that variation in mutation rates is not heritable, which suggests that the environment may influence mutation rates more significantly than previously realized.
