A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction

BACKGROUND Gene-gene interaction (GGI) analysis is one of the most popular approaches for finding the missing heritability of common complex traits in genetic association studies. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGIs. To identify the best interaction model associated with disease susceptibility, MDR compares all possible genotype combinations in terms of how well they predict disease status from a simple binary classification into high (H) and low (L) risk groups. However, this binary classification does not reflect the uncertainty of the H/L classification.

METHODS We regard the H/L classification as equivalent to defining the degree of membership in the two risk groups, H and L. By adopting fuzzy set theory, we propose Fuzzy MDR, which takes into account the uncertainty of the H/L classification. Fuzzy MDR allows partial membership in H and L through a membership function that maps the degree of uncertainty onto the [0,1] scale. The best genotype combination is then selected by maximizing a new fuzzy set based accuracy measure.

RESULTS Two simulation studies are conducted to compare the power of the proposed Fuzzy MDR with that of MDR. Our results show that Fuzzy MDR has higher power than MDR. We illustrate Fuzzy MDR by analysing the bipolar disorder (BD) trait in the WTCCC dataset to detect GGIs associated with BD.

CONCLUSIONS We propose a novel Fuzzy MDR method that detects gene-gene interactions by taking into account the uncertainty of the H/L classification, and we show that it has higher power than MDR. Fuzzy MDR can also be easily extended to handle continuous phenotypes. The R program for the proposed Fuzzy MDR is available at https://statgen.snu.ac.kr/software/FuzzyMDR.
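The contrast between a crisp H/L label and a partial membership can be illustrated with a minimal Python sketch. This is not the authors' R program, and the abstract does not specify the membership function or the accuracy measure, so everything here is an assumption made for illustration: the logistic membership on the log case/control ratio, the `steepness` parameter, the 0.5 continuity correction, and the function names `high_risk_membership` and `fuzzy_balanced_accuracy` are all hypothetical.

```python
import math


def high_risk_membership(cases, controls, total_cases, total_controls,
                         steepness=2.0, crisp=False):
    """Degree of membership of one genotype cell in the high-risk set H.

    Assumption: a logistic function of the log ratio between the cell's
    case/control odds and the overall case/control odds. The paper's
    actual membership function may differ.
    """
    # Continuity correction (assumed) so that empty cells do not divide by zero.
    cell_odds = (cases + 0.5) / (controls + 0.5)
    overall_odds = total_cases / total_controls
    log_ratio = math.log(cell_odds / overall_odds)
    if crisp:
        # Binary MDR-style rule: a cell is H (1) or L (0), nothing in between.
        return 1.0 if log_ratio > 0 else 0.0
    return 1.0 / (1.0 + math.exp(-steepness * log_ratio))


def fuzzy_balanced_accuracy(cells, steepness=2.0, crisp=False):
    """Membership-weighted balanced accuracy over genotype cells.

    cells: list of (cases, controls) counts, one pair per genotype cell.
    Assumption: cases count as correct in proportion to their cell's
    membership in H, controls in proportion to membership in L = 1 - H.
    """
    total_cases = sum(c for c, _ in cells)
    total_controls = sum(k for _, k in cells)
    weighted_tp = 0.0
    weighted_tn = 0.0
    for cases, controls in cells:
        mu_h = high_risk_membership(cases, controls, total_cases,
                                    total_controls, steepness, crisp)
        weighted_tp += mu_h * cases
        weighted_tn += (1.0 - mu_h) * controls
    return 0.5 * (weighted_tp / total_cases + weighted_tn / total_controls)


if __name__ == "__main__":
    # Hypothetical counts for the 9 cells of a two-SNP genotype table.
    table = [(30, 10), (25, 20), (12, 14), (20, 22), (15, 15),
             (10, 25), (8, 12), (6, 20), (4, 22)]
    print("crisp accuracy:", fuzzy_balanced_accuracy(table, crisp=True))
    print("fuzzy accuracy:", fuzzy_balanced_accuracy(table))
```

In this sketch, a cell whose case/control ratio sits close to the overall ratio receives a membership near 0.5 rather than being forced into H or L, which is the kind of uncertainty the binary rule discards. Selecting a model would then amount to computing such a score for each candidate SNP combination and keeping the one with the largest value.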
