Noise-robust assessment of SNP array based CNV calls through local noise estimation of log R ratios

Abstract. Arrays based on single nucleotide polymorphisms (SNPs) have been successful for the large-scale discovery of copy number variants (CNVs). However, current CNV calling algorithms still struggle to detect CNVs with high specificity and sensitivity, especially small (<100 kb) CNVs. This study therefore presents a simple statistical analysis for evaluating CNV calls from SNP arrays, with the aim of improving the noise robustness of existing CNV calling algorithms. The proposed approach estimates the local noise of the log R ratios and returns the probability that a given observation differs from this local noise level. This probability can be thresholded at different values to tune specificity and sensitivity flexibly. In a comparison based on qPCR experiments, the noise-robust CNV calls outperformed the original calls for multiple threshold values.
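The abstract does not spell out the statistical test, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the local noise is estimated from probes flanking a called segment, that this noise is roughly Gaussian and independent between probes, and that the segment mean is compared against it with a two-sided z-test. The function names, the `flank` parameter, and the 0.95 default threshold are hypothetical choices made for illustration.

```python
import math
import statistics


def local_noise_probability(lrr, start, end, flank=50):
    """Probability that the mean log R ratio of lrr[start:end] differs
    from the local noise level estimated on the flanking probes.

    Assumption (not from the paper): Gaussian, independent probe noise.
    """
    if end <= start:
        raise ValueError("segment must contain at least one probe")

    # Local background: up to `flank` probes on each side of the call.
    background = lrr[max(0, start - flank):start] + lrr[end:end + flank]
    if len(background) < 2:
        raise ValueError("not enough flanking probes to estimate noise")

    noise_mean = statistics.fmean(background)
    noise_sd = statistics.stdev(background)
    segment_mean = statistics.fmean(lrr[start:end])

    if noise_sd == 0:
        # Degenerate case: noiseless background.
        return 1.0 if segment_mean != noise_mean else 0.0

    # z-score of the segment mean under the local noise model.
    n_probes = end - start
    z = (segment_mean - noise_mean) / (noise_sd / math.sqrt(n_probes))

    # For a standard normal Z, P(|Z| <= |z|) = erf(|z| / sqrt(2)),
    # i.e. one minus the two-sided p-value.
    return math.erf(abs(z) / math.sqrt(2))


def keep_call(probability, threshold=0.95):
    """Accept a CNV call when its probability reaches the threshold."""
    return probability >= threshold


# Synthetic example: alternating +/-0.1 noise with a simulated
# deletion (log R ratio shifted by -0.5) over probes 40..49.
lrr = [0.1 if i % 2 == 0 else -0.1 for i in range(100)]
for i in range(40, 50):
    lrr[i] -= 0.5

p = local_noise_probability(lrr, 40, 50, flank=20)
print(keep_call(p))
```

Raising the threshold passed to `keep_call` favours specificity, while lowering it favours sensitivity, which mirrors the trade-off described in the abstract.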
