A Novel Evolution-Based Method for Detecting Gene-Gene Interactions

Background The rapid advance of large-scale SNP-chip technologies offers great opportunities for elucidating the genetic basis of complex diseases. Several groups have been developing methods for large-scale interaction analysis. Because of several difficult issues (e.g., sparseness of data in high dimensions and low replication or validation rates), developing fast, powerful and robust methods for detecting various forms of gene-gene interactions remains a challenging task.
Methodology/Principal Findings In this article, we develop an evolution-based method to search for genome-wide epistasis in a case-control design. From an evolutionary perspective, we regard human diseases as originating from ancient mutations and consider that the underlying genetic variants play a role in differentiating the human population into the healthy and the diseased. Based on this concept, a traditional evolutionary measure, the fixation index (Fst) for two unlinked loci, which measures the genetic distance between populations, should be able to reveal the genetic interplay responsible for disease traits. To validate our proposal, we first investigated the theoretical distribution of Fst using extensive simulations. We then explored its power for detecting gene-gene interactions via SNP markers and compared it with the conventional Pearson chi-square test, a mutual information-based test and a linkage disequilibrium (LD)-based test under several disease models. The proposed evolution-based method outperformed these methods in dominant and additive models, regardless of the disease allele frequencies. However, its performance was relatively poor in a recessive model. Finally, we applied the proposed evolution-based method to the analysis of a published dataset. Our results showed that the P value of the Fst-based statistic was smaller than those obtained by the LD-based statistic or Poisson regression models.
Conclusions/Significance As large-scale genetic association studies grow rapidly in number, the proposed evolution-based method can be a promising tool for identifying epistatic effects.
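
The idea of treating cases and controls as two populations and measuring their genetic distance with Fst can be illustrated with a minimal sketch. The abstract does not give the exact form of the statistic, so the code below is an assumption throughout: it applies the textbook definition Fst = (H_T - H_S) / H_T to the frequencies of two-locus genotype combinations, and it uses a permutation test in place of the simulated distribution the abstract describes. The function names, the 0/1/2 genotype coding and the parameters are hypothetical, not the authors' implementation.

```python
import random
from collections import Counter


def heterozygosity(freqs):
    """Gene diversity of a frequency vector: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def two_locus_fst(cases, controls):
    """Fst between cases and controls over two-locus genotype combinations.

    cases, controls: lists of (genotype_at_locus_A, genotype_at_locus_B)
    tuples, each genotype coded 0, 1 or 2 (an assumed coding).
    Uses the textbook form (H_T - H_S) / H_T, which is an assumption here.
    """
    categories = sorted(set(cases) | set(controls))
    n_case, n_ctrl = len(cases), len(controls)
    case_counts, ctrl_counts = Counter(cases), Counter(controls)

    # Frequency of each genotype combination within each group.
    p_case = [case_counts[c] / n_case for c in categories]
    p_ctrl = [ctrl_counts[c] / n_ctrl for c in categories]

    # Weight each group by its share of the pooled sample.
    w_case = n_case / (n_case + n_ctrl)
    w_ctrl = 1.0 - w_case
    p_total = [w_case * a + w_ctrl * b for a, b in zip(p_case, p_ctrl)]

    h_total = heterozygosity(p_total)
    h_within = w_case * heterozygosity(p_case) + w_ctrl * heterozygosity(p_ctrl)
    return 0.0 if h_total == 0 else (h_total - h_within) / h_total


def permutation_p_value(cases, controls, n_perm=1000, seed=0):
    """Empirical P value obtained by shuffling case/control labels.

    A stand-in for the simulated null distribution, not the authors' procedure.
    """
    rng = random.Random(seed)
    observed = two_locus_fst(cases, controls)
    pooled = list(cases) + list(controls)
    n_case = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if two_locus_fst(pooled[:n_case], pooled[n_case:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

In a genome-wide scan, a function like `two_locus_fst` would be evaluated once per SNP pair, so the cost grows quadratically with the number of markers; a permutation test on every pair would likely be too slow at that scale, which is one reason a known reference distribution for the statistic is useful.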
