EpiGPU: exhaustive pairwise epistasis scans parallelized on consumer level graphics cards

MOTIVATION Hundreds of genome-wide association studies have been performed over the last decade, but as single nucleotide polymorphism (SNP) chip density has increased so has the computational burden to search for epistasis [for n SNPs the computational time resource is O(n(n-1)/2)]. While the theoretical contribution of epistasis toward phenotypes of medical and economic importance is widely discussed, empirical evidence is conspicuously absent because its analysis is often computationally prohibitive. To facilitate resolution in this field, tools must be made available that can render the search for epistasis universally viable in terms of hardware availability, cost and computational time. RESULTS By partitioning the 2D search grid across the multicore architecture of a modern consumer graphics processing unit (GPU), we report a 92× increase in the speed of an exhaustive pairwise epistasis scan for a quantitative phenotype, and we expect the speed to increase as graphics cards continue to improve. To achieve a comparable computational improvement without a graphics card would require a large compute-cluster, an option that is often financially non-viable. The implementation presented uses OpenCL--an open-source library designed to run on any commercially available GPU and on any operating system. AVAILABILITY The software is free, open-source, platformindependent and GPU-vendor independent. It can be downloaded from http://sourceforge.net/projects/epigpu/.

[1]  W. G. Hill,et al.  Heritability in the genomics era — concepts and misconceptions , 2008, Nature Reviews Genetics.

[2]  Brett A. McKinney,et al.  Real-world comparison of CPU and GPU implementations of SNPrank: a network analysis tool for GWAS , 2011, Bioinform..

[3]  Leonid Kruglyak,et al.  The road to genome-wide association studies , 2008, Nature Reviews Genetics.

[4]  Ioannis Xenarios,et al.  FastEpistasis: a high performance computing solution for quantitative trait epistasis , 2010, Bioinform..

[5]  Arie E. Kaufman,et al.  GPU Cluster for High Performance Computing , 2004, Proceedings of the ACM/IEEE SC2004 Conference.

[6]  Jason H. Moore,et al.  A global view of epistasis , 2005, Nature Genetics.

[7]  P. Donnelly,et al.  Genome-wide strategies for detecting multiple loci that influence complex diseases , 2005, Nature Genetics.

[8]  Judy H. Cho,et al.  Finding the missing heritability of complex diseases , 2009, Nature.

[9]  P. Phillips The language of gene interaction. , 1998, Genetics.

[10]  David V Conti,et al.  A testing framework for identifying susceptibility genes in the presence of epistasis. , 2006, American journal of human genetics.

[11]  B. Welford Note on a Method for Calculating Corrected Sums of Squares and Products , 1962 .

[12]  David M. Evans,et al.  Two-Stage Two-Locus Models in Genome-Wide Association , 2006, PLoS genetics.

[13]  George Bosilca,et al.  Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation , 2004, PVM/MPI.

[14]  R. Doerge,et al.  Empirical threshold values for quantitative trait mapping. , 1994, Genetics.

[15]  Chris S. Haley,et al.  Epistasis: too often neglected in complex trait studies? , 2004, Nature Reviews Genetics.

[16]  N. Schork,et al.  Who's afraid of epistasis? , 1996, Nature Genetics.

[17]  H. Cordell Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans. , 2002, Human molecular genetics.

[18]  Jeff Huskamp Proceedings of the 2004 ACM/IEEE conference on Supercomputing , 2004 .

[19]  W. G. Hill,et al.  Data and Theory Point to Mainly Additive Genetic Variance for Complex Traits , 2008, PLoS genetics.

[20]  C. Haley,et al.  Genomewide Rapid Association Using Mixed Model and Regression: A Fast and Simple Method For Genomewide Pedigree-Based Quantitative Trait Loci Association Analysis , 2007, Genetics.