Evaluation of CNV detection tools for NGS panel data in genetic diagnostics

Although germline copy-number variants (CNVs) are the genetic cause of multiple hereditary diseases, detecting them from targeted next-generation sequencing (NGS) data remains a challenge. Existing tools perform well for large CNVs but struggle with single- and multi-exon alterations. The aim of this work is to evaluate CNV calling tools that work on gene panel NGS data and to assess their suitability as a screening step before orthogonal confirmation in genetic diagnostics strategies. Five tools (DECoN, CoNVaDING, panelcn.MOPS, ExomeDepth, and CODEX2) were tested against four genetic diagnostics datasets (two in-house and two external), for a total of 495 samples with 231 validated single- and multi-exon CNVs. The evaluation was performed using both the default and sensitivity-optimized parameters. Results showed that most tools were highly sensitive and specific, but performance was dataset-dependent. When the tools were evaluated in our diagnostics scenario, DECoN and panelcn.MOPS detected all CNVs, with the exception of one mosaic CNV missed by DECoN. However, DECoN outperformed panelcn.MOPS in specificity, achieving values greater than 0.90 when using the optimized parameters. In our in-house datasets, DECoN and panelcn.MOPS showed the highest performance for CNV screening before orthogonal confirmation. Benchmarking and optimization code is freely available at https://github.com/TranslationalBioinformaticsIGTP/CNVbenchmarkeR
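
The sensitivity and specificity figures quoted above follow the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The Python sketch below shows how such values could be derived by comparing a tool's calls against a validated truth set. It is an illustration only and is not taken from CNVbenchmarkeR: the names `ConfusionCounts` and `count_outcomes`, the choice of one identifier per tested region, and the example identifiers are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of tested regions by outcome (hypothetical helper type)."""
    tp: int  # validated CNV, called by the tool
    fp: int  # no validated CNV, but called by the tool
    tn: int  # no validated CNV, not called
    fn: int  # validated CNV, missed by the tool


def count_outcomes(called: set, validated: set, tested: set) -> ConfusionCounts:
    """Compare tool calls with the truth set over all tested regions.

    Each element is an arbitrary region identifier; the granularity
    (exon, gene, or whole sample) is an assumption left to the caller.
    """
    called = called & tested
    validated = validated & tested
    return ConfusionCounts(
        tp=len(called & validated),
        fp=len(called - validated),
        tn=len(tested - called - validated),
        fn=len(validated - called),
    )


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of validated CNVs that the tool detected."""
    if c.tp + c.fn == 0:
        raise ValueError("no validated CNVs: sensitivity is undefined")
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of CNV-free regions that the tool left uncalled."""
    if c.tn + c.fp == 0:
        raise ValueError("no CNV-free regions: specificity is undefined")
    return c.tn / (c.tn + c.fp)


# Made-up identifiers, for illustration only (not data from the study).
tested = {"s1:geneA", "s1:geneB", "s2:geneA", "s2:geneB", "s3:geneA"}
validated = {"s1:geneA", "s2:geneB"}
called = {"s1:geneA", "s2:geneA"}

counts = count_outcomes(called, validated, tested)
print(counts, sensitivity(counts), specificity(counts))
```

In a screening setting such as the one described here, a missed CNV (false negative) is never sent for orthogonal confirmation, which is why the tools were also run with sensitivity-optimized parameters; specificity then determines how many unnecessary confirmations are triggered.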
