Mapping and sequencing of structural variation from eight human genomes

Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale, in particular insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% of these variants were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation, which serves as a standard for genotyping platforms and a prelude to future individual genome sequencing projects.
