Genome-wide patterns and properties of de novo mutations in humans

Mutations create variation in the population, fuel evolution and cause genetic diseases. Current knowledge about de novo mutations is incomplete and mostly indirect. Here we analyze 11,020 de novo mutations from the whole genomes of 250 families. We show that de novo mutations in the offspring of older fathers are not only more numerous but also occur more frequently in early-replicating, genic regions. Functional regions exhibit higher mutation rates due to CpG dinucleotides and show signatures of transcription-coupled repair, whereas mutation clusters with a unique signature point to a new mutational mechanism. Mutation and recombination rates independently associate with nucleotide diversity, and regional variation in human-chimpanzee divergence is only partly explained by heterogeneity in mutation rate. Finally, we provide a genome-wide mutation rate map for medical and population genetics applications. Our results provide new insights and refine long-standing hypotheses about human mutagenesis.
