cuGWAM: Genome-wide association multifactor dimensionality reduction using CUDA-enabled high-performance graphics processing unit

The multifactor dimensionality reduction (MDR) method has been widely applied to detect gene-gene interactions, which are well recognized as playing an important role in understanding complex traits such as disease susceptibility. However, because MDR analysis is exhaustive, current MDR software is difficult to extend to genome-wide association studies (GWAS) with up to ∼1 million genetic markers. To overcome this computational problem, we developed CUDA-based genome-wide association MDR (cuGWAM) software using efficient hardware accelerators. cuGWAM not only performs better than CPU-based MDR methods (original MDR and parallel MDR) and another GPU-based method (MDRGPU), but its initial construction cost is also lower. Furthermore, cuGWAM provides various performance measures for evaluating MDR classifiers, including tau-b, likelihood ratio, and normalized mutual information as well as balanced accuracy. cuGWAM also provides three methods for handling missing genotypes: complete, available, and missing category. cuGWAM executables are freely available at http://bibs.snu.ac.kr/cugwam for systems with CUDA-enabled GPU devices.
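To make the computational burden concrete, the following is a minimal CPU-side Python sketch of a two-locus MDR scan scored by balanced accuracy. It is an illustration only, not cuGWAM's GPU implementation. The function names `mdr_balanced_accuracy` and `best_pair`, the 0/1/2 genotype coding, and the tie rule (a cell is labeled high-risk only when its case:control ratio strictly exceeds the overall ratio) are assumptions made for this example. It also assumes no missing genotypes and at least one case and one control.

```python
from collections import Counter
from itertools import combinations


def mdr_balanced_accuracy(geno_a, geno_b, status):
    """Balanced accuracy of a two-locus MDR classifier (illustrative sketch).

    geno_a, geno_b: genotype codes (assumed 0/1/2) per individual.
    status: 1 for case, 0 for control.
    """
    cases, controls = Counter(), Counter()
    for a, b, y in zip(geno_a, geno_b, status):
        (cases if y == 1 else controls)[(a, b)] += 1

    n_case = sum(cases.values())
    n_ctrl = sum(controls.values())

    true_pos = true_neg = 0
    for cell in set(cases) | set(controls):
        # High-risk cell: case:control ratio above the overall ratio.
        # Cross-multiplied so that empty control cells need no division.
        if cases[cell] * n_ctrl > controls[cell] * n_case:
            true_pos += cases[cell]      # cases in high-risk cells
        else:
            true_neg += controls[cell]   # controls in low-risk cells

    sensitivity = true_pos / n_case
    specificity = true_neg / n_ctrl
    return 0.5 * (sensitivity + specificity)


def best_pair(genotypes, status):
    """Exhaustively score every SNP pair and return the best index pair."""
    return max(
        combinations(range(len(genotypes)), 2),
        key=lambda p: mdr_balanced_accuracy(genotypes[p[0]], genotypes[p[1]], status),
    )
```

With 1 million markers, `best_pair` would have to score roughly 5 × 10^11 pairs. Each pair is scored independently of the others, which is why this kind of exhaustive search lends itself to GPU parallelization.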
