An overview of SNP interactions in genome-wide association studies.

With the recent explosion in high-throughput genotyping technology, the amount and quality of single-nucleotide polymorphism (SNP) data have increased exponentially. As a result, identifying SNP interactions associated with common diseases plays an increasingly important role in interpreting the genetic basis of disease susceptibility and in devising new diagnostic tests and treatments. However, although these data sets are large, they typically have small sample sizes and low signal-to-noise ratios, so no major breakthrough has been achieved despite many efforts, and the problem remains a major focus of bioinformatics. In this article, we review the two main aspects of SNP interaction studies in recent years, namely the simulation and the identification of SNP interactions, and then discuss the principles and efficiency of these methods and the differences between them.
