GPU-accelerated exhaustive search for third-order epistatic interactions in case-control studies

Abstract: Interest in discovering combinations of genetic markers that are strongly associated with diseases from case–control studies, such as Genome Wide Association Studies (GWAS), has increased in recent years. Detecting epistasis, i.e. interactions among k markers (k ≥ 2), is an important but time-consuming operation, since statistical computations have to be performed for each k-tuple of measured markers. Efficient exhaustive methods have been proposed for k = 2, but exhaustive third-order analyses are thought to be impractical because the number of triples to be computed grows cubically with the number of markers. Thus, most previous approaches apply heuristics that accelerate the analysis by discarding certain triples in advance. Unfortunately, these tools can fail to detect interesting interactions. We present GPU3SNP, a fast GPU-accelerated tool that exhaustively searches for interactions among all marker triples of a given case–control dataset. Our tool is able to analyze an input dataset with tens of thousands of markers in reasonable time thanks to two efficient CUDA kernels and efficient workload distribution techniques. For instance, a dataset consisting of 50,000 markers measured from 1000 individuals can be analyzed in less than 22 h on a single compute node with 4 NVIDIA GTX Titan boards. Source code is available at: http://sourceforge.net/projects/gpu3snp/ .
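To give a sense of the scale behind the "cubic number of triples", the following Python sketch counts the unordered marker triples for the 50,000-marker example and derives the average rate needed to finish within 22 h. It is back-of-the-envelope arithmetic only, not part of GPU3SNP; the function names are hypothetical, and the sketch assumes that each unordered triple is evaluated exactly once.

```python
from math import comb


def triple_count(num_markers: int) -> int:
    """Number of unordered marker triples, C(n, 3) = n(n-1)(n-2)/6."""
    return comb(num_markers, 3)


def required_throughput(num_markers: int, hours: float) -> float:
    """Average triples per second needed to finish within `hours`.

    Assumption (not stated in the abstract): every unordered triple
    is tested exactly once during the exhaustive search.
    """
    return triple_count(num_markers) / (hours * 3600.0)


markers = 50_000          # dataset size quoted in the abstract
time_budget_hours = 22.0  # upper bound on runtime quoted in the abstract

triples = triple_count(markers)
throughput = required_throughput(markers, time_budget_hours)

print(f"{triples:,} triples")               # about 2.08e13
print(f"{throughput:.3g} triples/s total")  # aggregate over the whole node
```

Under these assumptions, the whole node has to sustain on the order of 10^8 triple evaluations per second, which illustrates why the abstract stresses both the kernels and the workload distribution across the four boards.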
