Identification of copy number variation hotspots in human populations.

Copy number variants (CNVs) in the human genome contribute to both Mendelian and complex traits, as well as to genomic plasticity in evolution. Investigating CNV mutation rates is critical to understanding genomic instability and the etiology of CNV-related traits. However, evaluating the CNV mutation rate at the genome level poses a formidable practical challenge because it requires large samples and accurate typing. In this study, we show that an approximate estimate of the CNV mutation rate can be obtained from the phylogeny information of flanking SNPs. This allows a genome-wide comparison of mutation rates between CNVs using the vast, readily available body of SNP genotyping data. A total of 4187 CNV regions (CNVRs) previously identified in HapMap populations were investigated. We show that the mutation rates for the majority of these CNVRs are on the order of 10⁻⁵ per generation, consistent with experimental observations at individual loci. Notably, the mutation rates of 104 (2.5%) CNVRs were estimated to be on the order of 10⁻³ per generation; these CNVRs were therefore identified as potential hotspots. Additional analyses revealed that genome architecture at CNV loci has a potential role in inciting mutational hotspots in the human genome. Interestingly, 49 (47%) of the CNV hotspots include human genes, some of which are known functional CNV loci (e.g., CNVs of C4 and β-defensin, which cause autoimmune diseases, and CNVs of HYDIN, which are implicated in the control of cerebral cortex size). This points to an important role for CNVs in human health and evolution, especially in common and complex diseases.
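The proportions quoted above follow directly from the reported counts. The short Python sketch below recomputes them and illustrates what "on the order of" means for a per-generation rate. The counts are taken from the abstract; the helper names and the two example rates are hypothetical, chosen only for illustration, and do not reproduce the study's estimation method.

```python
import math

# Counts reported in the abstract.
TOTAL_CNVRS = 4187          # CNV regions investigated
HOTSPOT_CNVRS = 104         # CNVRs with rates on the order of 1e-3
HOTSPOTS_WITH_GENES = 49    # hotspot CNVRs that include human genes


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def order_of_magnitude(rate):
    """Return the base-10 exponent of a positive rate (hypothetical helper)."""
    return math.floor(math.log10(rate))


hotspot_share = percent(HOTSPOT_CNVRS, TOTAL_CNVRS)       # share of all CNVRs
genic_share = percent(HOTSPOTS_WITH_GENES, HOTSPOT_CNVRS)  # share of hotspots

# Illustrative rates only (not values from the study).
typical_rate = 2e-5
hotspot_rate = 3e-3

print(f"hotspots among CNVRs: {hotspot_share:.1f}%")
print(f"hotspots containing genes: {genic_share:.0f}%")
print(order_of_magnitude(typical_rate), order_of_magnitude(hotspot_rate))
```

Note that the second percentage is taken relative to the 104 hotspots, not the full set of 4187 CNVRs.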
