VikNGS: a C++ variant integration kit for next generation sequencing association analysis

Abstract

Summary: Integration of next generation sequencing (NGS) data across different research studies can improve the power of genetic association testing by increasing sample size and can obviate the need for sequencing controls. If differential genotype uncertainty across studies is not accounted for, combining datasets can produce spurious association results. We developed the Variant Integration Kit for NGS (VikNGS), a fast cross-platform software package, to enable aggregation of several datasets for rare and common variant genetic association analysis of quantitative and binary traits with covariate adjustment. VikNGS also includes a graphical user interface, power simulation functionality and data visualization tools.

Availability and implementation: The VikNGS package can be downloaded at http://www.tcag.ca/tools/index.html.

Supplementary information: Supplementary data are available at Bioinformatics online.
