Parallel Pairwise Epistasis Detection on Heterogeneous Computing Architectures

Developing new methods to detect pairwise epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies (GWAS) is an important task in bioinformatics, because these interactions can help to explain genetic influences on diseases. As such studies are time-consuming, some tools exploit the characteristics of different hardware accelerators (such as GPUs and Xeon Phi coprocessors) to reduce the runtime. Nevertheless, none of these approaches can efficiently exploit the whole computational capacity of modern clusters that contain both GPUs and Xeon Phi coprocessors. In this paper we investigate approaches to map pairwise epistasis detection onto heterogeneous clusters using both types of accelerators. With a hybrid approach that adds a single Xeon Phi coprocessor, the runtimes to analyze the well-known WTCCC dataset (about 500 K SNPs and 5 K samples) on one and on two NVIDIA K20m GPUs are reduced by 27 percent.
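
To give a sense of scale, an exhaustive pairwise scan over n SNPs has to test n(n-1)/2 pairs. The following is a minimal sketch, not the paper's implementation: it counts the pairs for a dataset of about 500 K SNPs and divides the pair index range among devices in proportion to relative throughput weights. The device names, the integer weights, and the helpers `pair_count` and `split_pairs` are hypothetical and chosen only for illustration; the abstract does not state how the work is actually distributed.

```python
from typing import Dict, Tuple


def pair_count(num_snps: int) -> int:
    """Number of unordered SNP pairs in an exhaustive pairwise scan."""
    return num_snps * (num_snps - 1) // 2


def split_pairs(total_pairs: int, weights: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Split [0, total_pairs) into contiguous half-open ranges, one per device.

    Range sizes are proportional to the positive integer weights (an assumed
    stand-in for relative device throughput). The last device absorbs any
    rounding remainder so that the ranges cover every pair exactly once.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty mapping of positive integers")
    total_weight = sum(weights.values())
    names = list(weights)
    ranges: Dict[str, Tuple[int, int]] = {}
    start = 0
    cumulative = 0
    for i, name in enumerate(names):
        cumulative += weights[name]
        if i == len(names) - 1:
            end = total_pairs
        else:
            end = total_pairs * cumulative // total_weight
        ranges[name] = (start, end)
        start = end
    return ranges


if __name__ == "__main__":
    total = pair_count(500_000)
    print(total)  # 124999750000 pairs for 500 K SNPs
    # Hypothetical weights: two GPUs and one coprocessor, values are made up.
    for device, (lo, hi) in split_pairs(total, {"gpu0": 3, "gpu1": 3, "phi0": 2}).items():
        print(device, lo, hi, hi - lo)
```

A static split like this only balances well if the weights match the real per-device throughput; a dynamic scheme that hands out smaller chunks on demand is a common alternative when those rates are not known in advance.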
