isoCNV: in silico optimization of copy number variant detection from targeted or exome sequencing data

Background: Accurate copy number variant (CNV) detection is especially challenging for both targeted sequencing (TS) and whole-exome sequencing (WES) data. To maximize performance, the parameters of a CNV calling algorithm should be optimized for each specific dataset. This requires validated CNV information obtained with either multiplex ligation-dependent probe amplification (MLPA) or array comparative genomic hybridization (aCGH). Both are gold-standard approaches, but they are time-consuming and costly.

Results: We present isoCNV, which optimizes the parameters of the DECoN algorithm using only NGS data. The parameter optimization is performed against an in silico validated CNV dataset built from the overlapping calls of three algorithms: CNVkit, panelcn.MOPS and DECoN. We evaluated the performance of our tool and showed that it increases sensitivity in both TS and WES real datasets.

Conclusions: isoCNV provides an easy-to-use pipeline to optimize DECoN, allowing the detection of analysis-ready CNVs from a set of DNA alignments obtained under the same conditions. It increases the sensitivity of DECoN without the need for orthogonal methods. isoCNV is available at https://gitlab.com/sequentiateampublic/isocnv.
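
The idea of building an in silico validated set from the overlapping calls of three callers, and then scoring a DECoN parameter setting against it, can be illustrated with a minimal sketch. This is not isoCNV's actual code: the `Call` record, the "at least one shared base, same sample and CNV type" overlap rule, and the function names are all assumptions made for illustration, and the coordinates are invented.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """A hypothetical CNV call record (half-open coordinates)."""
    sample: str
    chrom: str
    start: int
    end: int
    kind: str  # "DEL" or "DUP"


def overlaps(a: Call, b: Call) -> bool:
    """True if both calls share sample, chromosome and type and at least one base."""
    same_context = (a.sample, a.chrom, a.kind) == (b.sample, b.chrom, b.kind)
    return same_context and a.start < b.end and b.start < a.end


def consensus(reference: list[Call], *others: list[Call]) -> list[Call]:
    """Keep reference calls that are overlapped by at least one call in every other caller."""
    return [
        call for call in reference
        if all(any(overlaps(call, other) for other in calls) for calls in others)
    ]


def sensitivity(truth: list[Call], calls: list[Call]) -> float:
    """Fraction of truth calls recovered by at least one overlapping call."""
    if not truth:
        return 0.0
    recovered = sum(any(overlaps(t, c) for c in calls) for t in truth)
    return recovered / len(truth)


# Invented example calls from the three callers (assumed default parameters).
decon = [Call("S1", "chr17", 100, 200, "DEL"), Call("S1", "chr13", 500, 600, "DUP")]
cnvkit = [Call("S1", "chr17", 150, 250, "DEL"), Call("S1", "chr13", 500, 600, "DEL")]
panelcn = [Call("S1", "chr17", 90, 160, "DEL")]

# Only the chr17 deletion is supported by all three callers.
truth = consensus(decon, cnvkit, panelcn)

# Score a hypothetical DECoN run obtained with a different parameter setting.
candidate_run = [Call("S1", "chr17", 180, 300, "DEL")]
score = sensitivity(truth, candidate_run)
```

In a parameter search, `sensitivity` would be evaluated once per candidate parameter setting and the best-scoring setting retained; how isoCNV actually defines overlap and selects the optimum is described in the paper and repository, not in this sketch.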
