Current analysis platforms and methods for detecting copy number variation.

Copy number variation (CNV), generated through duplication or deletion events that affect one or more loci, is widespread in human genomes and is often associated with functional consequences, which may include changes in gene expression levels or the fusion of genes. Genome-wide association studies indicate that some disease phenotypes and physiological pathways might be affected by CNV in a small number of characterized genomic regions. However, the pervasiveness and full impact of such variation remain unclear. Suitable analytic methods are needed to mine human genomes thoroughly for structural variation and to explore the interplay between observed CNV and disease phenotypes. Yet many medical researchers are unfamiliar with the features and nuances of recently developed technologies for detecting CNV. In this article, we evaluate a suite of commonly used and recently developed approaches for detecting CNVs genome-wide and discuss the relative merits of each.
