Complex SNP-related sequence variation in segmental genome duplications

There is uncertainty about the true nature of predicted single-nucleotide polymorphisms (SNPs) in segmental duplications (duplicons) and whether these markers genuinely exist at increased density as indicated in public databases. We explored these issues by genotyping 157 predicted SNPs in duplicons and control regions in normal diploid genomes and fully homozygous complete hydatidiform moles. Our data identified many true SNPs in duplicon regions and few paralogous sequence variants. Twenty-eight percent of the polymorphic duplicon sequences we tested involved multisite variation, a new type of polymorphism representing the sum of the signals from many individual duplicon copies that vary in sequence content due to duplication, deletion or gene conversion. Multisite variations can masquerade as normal SNPs when genotyped. Given that duplicons comprise at least 5% of the genome and many are yet to be annotated in the genome draft, effective strategies to identify multisite variation must be established and deployed.
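The abstract does not state the decision rule used to separate true SNPs, paralogous sequence variants (PSVs) and multisite variations (MSVs). The following is a minimal sketch of one plausible reading, not the authors' method. It assumes that a fully homozygous complete hydatidiform mole cannot give a heterozygous call at a single-copy locus, so any heterozygous-looking signal in a mole must come from more than one duplicon copy. The function name `classify_marker`, the call labels `"AA"`, `"AB"`, `"BB"`, and the thresholds are all hypothetical.

```python
from typing import Sequence

# Hypothetical genotype-call labels: two homozygous states and one mixed signal.
VALID_CALLS = {"AA", "AB", "BB"}


def classify_marker(diploid_calls: Sequence[str], mole_calls: Sequence[str]) -> str:
    """Classify one predicted SNP from per-sample genotype calls.

    Assumed rule (an illustration, not taken from the paper):
      - a mixed ("AB") signal in a homozygous mole implies several copies;
        if every sample is mixed, the difference looks fixed between
        paralogs (PSV), otherwise copy content varies between genomes (MSV);
      - with no mixed signal in any mole, variation among samples is
        treated as a true SNP, and no variation as monomorphic.
    """
    if not diploid_calls or not mole_calls:
        raise ValueError("need at least one diploid and one mole sample")

    all_calls = list(diploid_calls) + list(mole_calls)
    unknown = set(all_calls) - VALID_CALLS
    if unknown:
        raise ValueError(f"unknown genotype calls: {sorted(unknown)}")

    if "AB" in mole_calls:
        # A single-copy locus cannot look heterozygous in a homozygous genome.
        if all(call == "AB" for call in all_calls):
            return "PSV"
        return "MSV"

    if len(set(all_calls)) == 1:
        return "monomorphic"
    return "SNP"


if __name__ == "__main__":
    print(classify_marker(["AA", "AB", "BB"], ["AA", "BB"]))
    print(classify_marker(["AB", "AB"], ["AB", "AB"]))
    print(classify_marker(["AA", "AB"], ["AB", "AA"]))
```

This rule is deliberately coarse. As the abstract notes, an MSV can masquerade as a normal SNP when genotyped, so a marker labelled "SNP" here would still need copy-number-aware follow-up before being trusted.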
