FastEpistasis: a high performance computing solution for quantitative trait epistasis

Motivation: Genome-wide association studies have become widely used tools to study the effects of genetic variants on complex diseases. While it is of great interest to extend existing analysis methods by considering interaction effects between pairs of loci, the large number of possible tests presents a significant computational challenge. The number of computations is further multiplied in gene expression quantitative trait mapping, in which tests are performed for thousands of gene phenotypes simultaneously.

Results: We present FastEpistasis, an efficient parallel solution extending the PLINK epistasis module, designed to test for epistasis effects when analyzing continuous phenotypes. Our results show that the algorithm scales with the number of processors and offers a reduction in computation time when several phenotypes are analyzed simultaneously. FastEpistasis is capable of testing the association of a continuous trait with all single nucleotide polymorphism (SNP) pairs from 500 000 SNPs, totaling 125 billion tests, in a population of 5000 individuals in 29, 4 or 0.5 days using 8, 64 or 512 processors, respectively.

Availability: FastEpistasis is open source and available free of charge only for non-commercial users from http://www.vital-it.ch/software/FastEpistasis

Contact: karen.kapur@unil.ch

Supplementary information: Supplementary data are available at Bioinformatics online.
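The test count and run times quoted in the Results can be checked with a short calculation. The following is a minimal sketch and not part of FastEpistasis: the function names `pair_count` and `scaling_table` are hypothetical, and because the reported run times are rounded, the resulting efficiencies are only approximate.

```python
from math import comb


def pair_count(n_snps: int) -> int:
    """Number of unordered SNP pairs, n * (n - 1) / 2."""
    return comb(n_snps, 2)


def scaling_table(runs):
    """Summarise parallel scaling relative to the first (smallest) run.

    `runs` is a list of (processors, days) tuples. Each returned row is
    (processors, days, processor_days, speedup, efficiency), where
    efficiency is the observed speedup divided by the ideal speedup.
    """
    base_procs, base_days = runs[0]
    rows = []
    for procs, days in runs:
        speedup = base_days / days    # observed speedup over the baseline run
        ideal = procs / base_procs    # speedup expected under perfect scaling
        rows.append((procs, days, procs * days, speedup, speedup / ideal))
    return rows


# Figures taken from the abstract: 500 000 SNPs; 29, 4 and 0.5 days
# on 8, 64 and 512 processors.
n_tests = pair_count(500_000)
reported_runs = [(8, 29.0), (64, 4.0), (512, 0.5)]

if __name__ == "__main__":
    print(f"pairwise tests: {n_tests:,}")
    for procs, days, proc_days, speedup, eff in scaling_table(reported_runs):
        print(f"{procs:4d} processors: {days:5.1f} days, "
              f"{proc_days:6.1f} processor-days, "
              f"speedup {speedup:5.2f}, efficiency {eff:.2f}")
```

The exact pair count is 124 999 750 000, which rounds to the 125 billion tests stated above. The total processor-days stay roughly constant across the three configurations (232, 256 and 256), which is consistent with the claim that the algorithm scales with the number of processors.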
