Detecting Copy Number Variation via Next Generation Technology

Purpose of Review: Copy number variants (CNVs), gains and losses of segments of genomic DNA associated with normal phenotypic variation and disease states, are traditionally detected using chromosomal microarrays. Recent bioinformatic advances now allow for the detection of CNVs using next generation sequencing (NGS) data, greatly increasing the clinical utility of NGS tests.

Recent Findings: Although the practice is not yet widespread, clinical diagnostic laboratories have started to implement CNV detection from targeted NGS gene panels and whole exome sequencing data, despite some limitations. Multiple tools have been designed to overcome these limitations, with some promising results. However, no single tool yet offers the sensitivity and specificity needed to make NGS-based CNV detection more than a supplementary assay for clinical laboratories.

Summary: As sequencing costs drop and sequencing technologies improve, some of these shortcomings may be overcome by whole genome sequencing or long-read sequencing technologies. Here, we review methods used to detect CNVs from NGS data, including studies comparing their performance.
