Characteristics of de novo structural changes in the human genome

Small insertions and deletions (indels) and large structural variations (SVs) are major contributors to human genetic diversity and disease. However, mutation rates and characteristics of de novo indels and SVs in the general population have remained largely unexplored. We report 332 validated de novo structural changes identified in whole genomes of 250 families, including complex indels, retrotransposon insertions, and interchromosomal events. These data indicate a mutation rate of 2.94 indels (1-20 bp) and 0.16 SVs (>20 bp) per generation. De novo structural changes affect on average 4.1 kbp of genomic sequence and 29 coding bases per generation, which is 91 and 52 times more nucleotides than de novo substitutions, respectively. This contrasts with the equal genomic footprint of inherited SVs and substitutions. An excess of structural changes originated on paternal haplotypes. Additionally, we observed a nonuniform distribution of de novo SVs across offspring. These results reveal the importance of different mutational mechanisms to changes in human genome structure across generations.
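The fold differences quoted above can be turned into an implied per-generation footprint for de novo substitutions. The following is a minimal back-of-the-envelope sketch, not the authors' analysis: it assumes "91 and 52 times more nucleotides" is a simple ratio of average affected bases, and all variable and function names are illustrative.

```python
# Per-generation averages quoted in the abstract.
STRUCTURAL_GENOMIC_BP = 4100.0  # 4.1 kbp of genomic sequence affected
STRUCTURAL_CODING_BP = 29.0     # coding bases affected
GENOMIC_FOLD = 91.0             # fold excess over substitutions, genome-wide
CODING_FOLD = 52.0              # fold excess over substitutions, coding sequence
INDEL_RATE = 2.94               # de novo indels (1-20 bp) per generation
SV_RATE = 0.16                  # de novo SVs (>20 bp) per generation


def implied_substitution_bp(structural_bp: float, fold: float) -> float:
    """Bases affected by substitutions, assuming fold = structural / substitution."""
    return structural_bp / fold


# Implied substitution footprints under the simple-ratio assumption.
substitution_genomic_bp = implied_substitution_bp(STRUCTURAL_GENOMIC_BP, GENOMIC_FOLD)
substitution_coding_bp = implied_substitution_bp(STRUCTURAL_CODING_BP, CODING_FOLD)

# Combined rate of de novo structural changes and the share contributed by SVs.
total_structural_rate = INDEL_RATE + SV_RATE
sv_share = SV_RATE / total_structural_rate

print(f"Implied substitution footprint: {substitution_genomic_bp:.1f} bp genome-wide")
print(f"Implied substitution footprint: {substitution_coding_bp:.2f} coding bp")
print(f"Structural changes per generation: {total_structural_rate:.2f}")
print(f"Share of events that are SVs: {sv_share:.1%}")
```

Under these assumptions, substitutions would affect roughly 45 bases genome-wide and well under one coding base per generation, which illustrates why a few structural events per generation can dominate the number of nucleotides changed.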
