The CNVrd2 package: measurement of copy number at complex loci using high-throughput sequencing data

Recent advances in high-throughput sequencing technologies have made it possible to accurately assign copy number (CN) at CN variable loci. However, current analytic methods often perform poorly in regions where complex CN variation is observed. Here we report the development of a read depth-based approach, CNVrd2, for investigating CN variation using high-throughput sequencing data. The methodology was developed using 1000 Genomes Project data from the CCL3L1 locus and tested using data from the DEFB103A locus. In both cases, samples were selected for which paralog ratio test data were also available for comparison. The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population. A linear regression model is then applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model that clusters segmentation scores into groups corresponding to individual CN counts. The performance of CNVrd2 was compared with that of two other read depth-based methods (CNVnator and cn.mops) at the CCL3L1 and DEFB103A loci. CNVrd2 showed the highest concordance with the paralog ratio test method: at CCL3L1 and DEFB103A respectively, concordance was 77.8% and 90.4% for CNVrd2, 36.7% and 4.8% for cn.mops, and 7.2% and 1% for CNVnator. CNVrd2 is available as an R package as part of the Bioconductor project: http://www.bioconductor.org/packages/release/bioc/html/CNVrd2.html.
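
The clustering step described above can be illustrated with a small sketch. This is not the CNVrd2 implementation and does not use its API. It assumes a plain maximum-likelihood EM fit of a one-dimensional normal mixture in place of the Bayesian normal mixture model named in the abstract, and the function name `fit_normal_mixture`, the starting spread, the variance floor, and the example scores are all hypothetical.

```python
import math


def fit_normal_mixture(scores, init_means, n_iter=50, sd_floor=0.05):
    """Fit a 1-D normal mixture by EM (a stand-in for a Bayesian fit).

    scores     -- per-sample segmentation scores (illustrative values)
    init_means -- starting centre for each CN group (an assumption)
    Returns (means, sds, weights, labels); labels are component indices.
    """
    k = len(init_means)
    means = list(init_means)
    sds = [0.2] * k                  # assumed starting spread
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each score.
        resp = []
        for x in scores:
            dens = [
                w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                for w, m, s in zip(weights, means, sds)
            ]
            total = sum(dens) or 1e-300   # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue                  # leave an empty component unchanged
            weights[j] = nj / len(scores)
            means[j] = sum(r[j] * x for r, x in zip(resp, scores)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, scores)) / nj
            sds[j] = max(math.sqrt(var), sd_floor)  # floor avoids collapse
    # Assign each sample to its most probable component.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, sds, weights, labels


# Hypothetical segmentation scores forming three well-separated groups.
example_scores = [-0.52, -0.48, -0.50, -0.02, 0.01, 0.03, 0.00, 0.49, 0.51, 0.53]
means, sds, weights, labels = fit_normal_mixture(example_scores, [-0.5, 0.0, 0.5])
```

Each label is only a component index. Mapping components to absolute CN values (for example, deciding which group corresponds to two copies) requires external information, such as the paralog ratio test data mentioned above.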
