Copy Number Variation detection from 1000 Genomes project exon capture sequencing data

Background: DNA capture technologies combined with high-throughput sequencing now enable cost-effective, deep-coverage, targeted sequencing of complete exomes. This is well suited for SNP discovery and genotyping. However, there has been little attention devoted to Copy Number Variation (CNV) detection from exome capture datasets, despite the potentially high impact of CNVs in exonic regions on protein function.

Results: As members of the 1000 Genomes Project analysis effort, we investigated 697 samples in which 931 genes were targeted and sampled with 454 or Illumina paired-end sequencing. We developed a rigorous Bayesian method to detect CNVs in the genes, based on read depth within target regions. Despite substantial variability in read coverage across samples and targeted exons, we were able to identify 107 heterozygous deletions in the dataset. The experimentally determined false discovery rate (FDR) of the cleanest dataset, from the Wellcome Trust Sanger Institute, is 12.5%. We were able to substantially improve the FDR in a subset of gene deletion candidates that were adjacent to another gene deletion call (17 calls). The estimated sensitivity of our call-set was 45%.

Conclusions: This study demonstrates that exonic sequencing datasets, collected both in population-based and medical sequencing projects, will be a useful substrate for detecting genic CNV events, particularly deletions. Based on the number of events we found and the sensitivity of the methods in the present dataset, we estimate on average 16 genic heterozygous deletions per individual genome. Our power analysis informs ongoing and future projects about sequencing depth and uniformity of read coverage required for efficient detection.
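To make the idea of Bayesian CNV calling from read depth concrete, the following is a minimal sketch and not the method used in the study. It assumes a Poisson model in which the expected read count in a gene scales linearly with copy number; the function name `copy_number_posterior`, the prior values, and the small floor on the rate for copy number 0 are all hypothetical choices made for illustration.

```python
import math

def copy_number_posterior(observed_reads, expected_diploid_reads, priors=None):
    """Posterior probability of each copy number given a read count.

    Assumption: reads follow a Poisson distribution whose rate is
    expected_diploid_reads * (copy_number / 2).
    """
    if priors is None:
        # Hypothetical priors: most genes are assumed to be diploid.
        priors = {0: 0.001, 1: 0.01, 2: 0.978, 3: 0.01, 4: 0.001}

    log_post = {}
    for cn, prior in priors.items():
        # A small floor keeps the rate positive for copy number 0
        # (for example, to allow for a few mis-mapped reads).
        rate = max(expected_diploid_reads * cn / 2.0,
                   0.01 * expected_diploid_reads)
        # Poisson log-likelihood of the observed count.
        log_lik = (observed_reads * math.log(rate) - rate
                   - math.lgamma(observed_reads + 1))
        log_post[cn] = math.log(prior) + log_lik

    # Normalise in log space for numerical stability.
    peak = max(log_post.values())
    unnorm = {cn: math.exp(lp - peak) for cn, lp in log_post.items()}
    total = sum(unnorm.values())
    return {cn: v / total for cn, v in unnorm.items()}


# Usage: a gene with about half the expected coverage.
posterior = copy_number_posterior(observed_reads=50, expected_diploid_reads=100)
best_call = max(posterior, key=posterior.get)
print(best_call, posterior[best_call])
```

Under these assumptions, a gene with roughly half the expected read count is assigned copy number 1 (a heterozygous deletion) despite the strong diploid prior. With shallow or highly variable coverage the likelihoods for neighbouring copy numbers overlap more, which is why sequencing depth and coverage uniformity limit detection power.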
