Epistasis analysis using information theory.

Here we introduce entropy-based measures derived from information theory for detecting and characterizing epistasis in genetic association studies. We provide a general overview of the methods and highlight some of the modifications that have greatly improved their power for genetic analysis. We close with several published studies of complex human diseases that have used these measures.
