Mechanisms of Formation of Structural Variation in a Fully Sequenced Human Genome

Despite significant advances in technology, few studies of structural variation have resolved variants to the level of the precise nucleotide junction. We examined the sequences of 408,532 gains, 383,804 losses, and 166 inversions from the first sequenced personal genome to quantify the relative proportions of mutational mechanisms. Among small variants (<1 kb), 72.6% were associated with nonhomologous processes and 24.9% with microsatellite events. Medium-size variants (<10 kb) were commonly related to minisatellites (25.8%) and retrotransposons (24%), whereas 46.2% of large variants (>10 kb) were associated with nonallelic homologous recombination. We genotyped eight new breakpoint-resolved inversions (at 3q26.1, Xp11.22, 7q11.22, 16q23.1, 4q22.1, 1q31.3, 6q27, and 16q24.1) in human populations to elucidate the structure of these presumed benign variants. Three of these inversions (3q26.1, 7q11.22, and 16q23.1) were accompanied by unexpected complex rearrangements. In particular, the 16q23.1 inversion and an accompanying deletion would create conjoined chymotrypsinogen genes (CTRB1 and CTRB2) and disrupt their gene structure; these variants also show differentiated allele frequencies among populations. We also found two loci (Xp11.3 and 6q27) with potential orientation errors in the reference assembly. This study provides a thorough account of the formation mechanisms of structural variants and offers a glimpse of the dynamic structure of inversions.
