Using the Generalized Index of Dissimilarity to Detect Gene-Gene Interactions in Multi-Class Phenotypes

Joint analysis is an important procedure for identifying genetic associations with complex diseases and phenotypic traits. Multifactor dimensionality reduction (MDR) is an efficient method for examining gene-gene interactions in genetic association studies. It typically assumes a binary phenotype: a binary risk status is assigned to each multifactor genotype class, and a confusion matrix is then constructed to estimate the classification error. Although multi-class phenotypes are commonly observed, the current MDR approach does not handle them appropriately because the thresholds for assigning risk statuses may not be clear. In this study, we propose a new method for detecting gene-gene interactions for multi-class phenotypes. Our approach adopts the index of dissimilarity (IDS) as the evaluation measure. For binary traits, IDS is analytically equivalent to balanced accuracy (BA), the association measure commonly used in MDR, but it does not require a risk status to be determined. Moreover, it extends naturally to the generalized index of dissimilarity (GIDS), which has an explicit form that can handle any number of categories. We compared the performance of the proposed method with that of other approaches in simulation studies in which fifteen genetic models were generated with three-class outcomes. The proposed method performed consistently better. We also examined the effect of varying the number of categories. Finally, we illustrate the proposed method using real genome-wide association study (GWAS) data from the Korean Association Resource (KARE) project.
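The relationship between IDS and BA for binary traits can be illustrated with a small sketch. The abstract does not state the formulas, so the definitions below are assumptions: IDS is taken as the standard index of dissimilarity between the case and control distributions over genotype cells, the MDR risk rule labels a cell high-risk when its case proportion exceeds its control proportion, and `gids` follows one commonly cited form of the generalized index (absolute deviations from independence, normalized so that it reduces to IDS for two classes). The function names and the toy counts are hypothetical and are not taken from the paper.

```python
def ids(cases, controls):
    """Index of dissimilarity between case and control distributions
    over genotype cells (assumed standard definition)."""
    total_cases, total_controls = sum(cases), sum(controls)
    return 0.5 * sum(
        abs(a / total_cases - b / total_controls)
        for a, b in zip(cases, controls)
    )


def mdr_balanced_accuracy(cases, controls):
    """Balanced accuracy under an assumed MDR-style rule: a cell is
    high-risk when its case proportion exceeds its control proportion."""
    total_cases, total_controls = sum(cases), sum(controls)
    high = [a / total_cases > b / total_controls
            for a, b in zip(cases, controls)]
    sensitivity = sum(a for a, h in zip(cases, high) if h) / total_cases
    specificity = sum(b for b, h in zip(controls, high) if not h) / total_controls
    return 0.5 * (sensitivity + specificity)


def gids(table):
    """Generalized index of dissimilarity (assumed form).
    table[i][j] is the count of genotype cell i in phenotype class j."""
    n = sum(sum(row) for row in table)
    class_props = [sum(col) / n for col in zip(*table)]
    deviation = 0.0
    for row in table:
        row_total = sum(row)
        for observed, p in zip(row, class_props):
            # Expected count if genotype cell and phenotype class were independent.
            deviation += abs(observed - row_total * p)
    return deviation / (2 * n * sum(p * (1 - p) for p in class_props))


# Hypothetical counts for the 9 genotype cells of a two-SNP model.
cases = [10, 20, 5, 30, 15, 5, 5, 5, 5]
controls = [20, 10, 10, 15, 20, 10, 5, 5, 5]

print("IDS:", ids(cases, controls))
print("BA: ", mdr_balanced_accuracy(cases, controls))
print("GIDS (two classes):", gids([list(pair) for pair in zip(cases, controls)]))
```

Under these assumptions, BA equals (1 + IDS) / 2, so ranking candidate models by IDS or by BA gives the same order, and `ids` never has to assign a risk status to a cell. The `gids` function accepts any number of phenotype columns, which is the property the abstract relies on for multi-class outcomes.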
