Detection of gene × gene interactions in genome-wide association studies of human population data.

Empirical evidence that gene × gene interactions are common, coupled with the frequent failure to replicate results from earlier association studies, has prompted statisticians to develop methods for detecting such interactions. Nonparametric methods have generated intense interest because of their capacity to handle high-dimensional data. Genome-wide association analysis of large-scale SNP data is challenging both mathematically and computationally. In this paper, we describe the major issues and questions arising from this challenge, along with their methodological implications. Data reduction and pattern recognition methods appear to be the new frontiers in efforts to detect gene × gene interactions comprehensively. Currently, no single method is recognized as the ‘best’ for detecting, characterizing, and interpreting gene × gene interactions. Instead, combining approaches so that their specific strengths balance one another may be the best way to investigate gene × gene interactions in human data.
