Toward accurate high-throughput SNP genotyping in the presence of inherited copy number variation

Background: The recent discovery of widespread copy number variation in humans has forced a shift away from the assumption of two copies per locus per cell throughout the autosomal genome. In particular, a SNP site can no longer always be accurately assigned one of three genotypes in an individual. In the presence of copy number variability, the individual may theoretically harbor any number of copies of each of the two SNP alleles.

Results: To address this issue, we have developed a method to infer a "generalized genotype" from raw SNP microarray data. Here we apply our approach to data from 48 individuals and uncover thousands of aberrant SNPs, most in regions that were previously unreported as copy number variants. We show that our allele-specific copy numbers follow Mendelian inheritance patterns that would be obscured in the absence of SNP allele information. The interplay between duplication and point mutation in our data sheds light on the relative frequencies of these events in human history, showing that at least some of the duplication events were recurrent.

Conclusion: This new multi-allelic view of SNPs has a complicated role in disease association studies, and further work will be necessary in order to accurately assess its importance. Software to perform generalized genotyping from SNP array data is freely available online [1].
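To make the idea of a generalized genotype concrete, the following minimal sketch represents a genotype as a pair of allele-specific copy numbers (copies of allele A, copies of allele B) and checks whether a child's genotype is compatible with Mendelian transmission from its parents. It is an illustration only and not the inference method described above: the function names `haplotype_splits` and `mendelian_consistent` are hypothetical, and the sketch assumes that each parent transmits exactly one chromosome, that the copies on a parent's two chromosomes may be partitioned in any way, and that no new mutation or copy number change occurs.

```python
from itertools import product


def haplotype_splits(genotype):
    """List every way to divide a generalized genotype (n_A, n_B)
    between the two homologous chromosomes of one individual."""
    n_a, n_b = genotype
    return [((a, b), (n_a - a, n_b - b))
            for a in range(n_a + 1)
            for b in range(n_b + 1)]


def mendelian_consistent(mother, father, child):
    """Return True if the child's (n_A, n_B) can be formed by adding one
    chromosome from the mother and one from the father (assumption: no
    de novo events)."""
    for m_split, f_split in product(haplotype_splits(mother),
                                    haplotype_splits(father)):
        # Try each transmitted chromosome from each parent.
        for m_hap, f_hap in product(m_split, f_split):
            if (m_hap[0] + f_hap[0], m_hap[1] + f_hap[1]) == tuple(child):
                return True
    return False


# Mother carries three copies (AAB), father the usual two (AB).
mother, father = (2, 1), (1, 1)
print(mendelian_consistent(mother, father, (2, 1)))  # child AAB
print(mendelian_consistent(mother, father, (0, 3)))  # child BBB
```

In this toy example both candidate children carry three copies in total, which matches a mother with three copies and a father with two. Only the allele-specific counts separate them: the AAB child is compatible with the parents, whereas the BBB child is not, because the parents carry only two B copies between them. This is the sense in which allele information exposes inheritance patterns that total copy number alone would hide.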

[1] J. Sebat, et al. Representational oligonucleotide microarray analysis: a high-resolution method to detect genome copy number variation, 2003, Genome research.

[2] J. Lupski, et al. Genetic proof of unequal meiotic crossovers in reciprocal deletion and duplication of 17p11.2, 2002, American journal of human genetics.

[3] M. Olivier. A haplotype map of the human genome, 2003, Nature.

[4] Ajay N. Jain, et al. Assembly of microarrays for genome-wide measurement of DNA copy number, 2001, Nature Genetics.

[5] Ton Feuth, et al. Diagnostic genome profiling in mental retardation, 2005, American journal of human genetics.

[6] Christian A. Rees, et al. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors, 2002, Proceedings of the National Academy of Sciences of the United States of America.

[7] Kenny Q. Ye, et al. Large-Scale Copy Number Polymorphism in the Human Genome, 2004, Science.

[8] L. Feuk, et al. Detection of large-scale variation in the human genome, 2004, Nature Genetics.

[9] R. Redon, et al. Copy Number Variation: New Insights in Genome Diversity, 2006.

[10] Herbert Herzog, et al. Y4 receptor knockout rescues fertility in ob/ob mice, 2002, Genes & development.

[11] M. Ashburner, et al. Gene Ontology: tool for the unification of biology, 2000, Nature Genetics.

[12] Murat Bastepe, et al. A rapid microarray based whole genome analysis for detection of uniparental disomy, 2005, Human mutation.

[13] Jing Huang, et al. Dynamic model based algorithms for screening and genotyping over 100K SNPs on oligonucleotide microarrays, 2005, Bioinform.

[14] D. Zwijnenburg, et al. Relative quantification of 40 nucleic acid sequences by multiplex ligation-dependent probe amplification, 2002, Nucleic acids research.

[15] Anthony J Brookes, et al. Complex SNP-related sequence variation in segmental genome duplications, 2004, Nature Genetics.

[16] M. Humbert, et al. BMPR2 gene rearrangements account for a significant proportion of mutations in familial and idiopathic pulmonary arterial hypertension, 2006, Human mutation.

[17] David Harrington, et al. PLASQ: a generalized linear model-based procedure to determine allelic dosage in cancer cells from SNP array data, 2007, Biostatistics.

[18] B. Trask, et al. Members of the olfactory receptor gene family are contained in large blocks of DNA duplicated polymorphically near the ends of human chromosomes, 1998, Human molecular genetics.

[19] Cheng Li, et al. Allele-Specific Amplification in Cancer Revealed by SNP Array Analysis, 2005, PLoS Comput. Biol.

[20] Xavier Estivill, et al. Complex patterns of copy number variation at sites of segmental duplications: an important category of structural variation in the human genome, 2006, Human Genetics.

[21] D. Conrad, et al. A high-resolution survey of deletion polymorphism in the human genome, 2006, Nature Genetics.

[22] E. Eichler, et al. Segmental duplications and copy-number variation in the human genome, 2005, American journal of human genetics.

[23] Yong-shu He, et al. [Structural variation in the human genome], 2009, Yi chuan = Hereditas.

[24] D. Conrad, et al. Global variation in copy number in the human genome, 2006, Nature.

[25] Charles Lee, et al. Genome-wide detection of human copy number variations using high-density DNA oligonucleotide arrays, 2006, Genome research.

[26] Robert Gentleman, et al. Using GOstats to test gene lists for GO term association, 2007, Bioinform.

[27] L. Feuk, et al. Structural variation in the human genome, 2006, Nature Reviews Genetics.

[28] Sarah Barber, et al. Oligonucleotide microarray analysis of genomic imbalance in children with mental retardation, 2006, American journal of human genetics.

[29] Pardis C Sabeti, et al. Common deletion polymorphisms in the human genome, 2006, Nature Genetics.

[30] R. Leinonen, et al. Global analysis of uniparental disomy using high density genotyping arrays, 2005, Journal of Medical Genetics.