Integrated detection and population-genetic analysis of SNPs and copy number variation

Dissecting the genetic basis of disease risk requires measuring all forms of genetic variation, including SNPs and copy number variants (CNVs), and is enabled by accurate maps of their locations, frequencies and population-genetic properties. We designed a hybrid genotyping array (Affymetrix SNP 6.0) to simultaneously measure 906,600 SNPs and copy number at 1.8 million genomic locations. By characterizing 270 HapMap samples, we developed a map of human CNV (at 2-kb breakpoint resolution) informed by integer genotypes for 1,320 copy number polymorphisms (CNPs) that segregate at an allele frequency >1%. More than 80% of the sequence in previously reported CNV regions fell outside our estimated CNV boundaries, indicating that large (>100 kb) CNVs affect much less of the genome than initially reported. Approximately 80% of observed copy number differences between pairs of individuals were due to common CNPs with an allele frequency >5%, and more than 99% derived from inheritance rather than new mutation. Most common, diallelic CNPs were in strong linkage disequilibrium with SNPs, and most low-frequency CNVs segregated on specific SNP haplotypes.
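As a rough illustration of two quantities the abstract relies on, the sketch below computes the allele frequency of a diallelic deletion CNP from integer copy-number genotypes and estimates its correlation with a nearby SNP. This is not the authors' pipeline. The sample data, the function names `allele_frequency` and `dosage_r2`, and the use of a genotype-dosage correlation as a stand-in for haplotype-based r² are all assumptions made for illustration.

```python
def allele_frequency(dosages):
    """Frequency of the counted allele, given per-sample dosages (0, 1 or 2)."""
    return sum(dosages) / (2 * len(dosages))


def dosage_r2(x, y):
    """Squared Pearson correlation between two per-sample dosage vectors.

    This genotype-level statistic only approximates haplotype-based r^2.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return float("nan")  # a monomorphic marker carries no LD information
    return sxy * sxy / (sxx * syy)


# Hypothetical integer copy-number genotypes for a diallelic deletion CNP
# (2 = no deletion, 1 = heterozygous deletion, 0 = homozygous deletion).
copy_number = [2, 2, 1, 2, 0, 1, 2, 2]
deletion_dosage = [2 - cn for cn in copy_number]

# Hypothetical minor-allele dosages at a nearby SNP in the same samples.
snp_dosage = [0, 0, 1, 0, 2, 1, 0, 1]

cnp_freq = allele_frequency(deletion_dosage)
r2 = dosage_r2(deletion_dosage, snp_dosage)
print(f"deletion allele frequency: {cnp_freq:.3f}")
print(f"dosage r^2 with SNP: {r2:.3f}")
```

Under these assumptions, a CNP would count as common in the abstract's sense when `cnp_freq` exceeds 0.05, and an `r2` near 1 would suggest the SNP could serve as a proxy for the CNP in an association study.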
