Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression

Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is identifying sets of interacting genetic variants that might contribute to complex traits. Logic regression (LR) is a powerful multivariant association tool, and several LR-based approaches have been successfully applied to different datasets. However, these approaches fall short in both accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents runs in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. The swarm algorithm improves accuracy and efficiency by speeding up convergence and by keeping the search from becoming trapped in local optima. We apply our approach to a real screening dataset and a series of simulation scenarios. Compared with three existing LR-based approaches, FSLR achieves lower type I and type II error rates, identifies more of the preset causal sites, and runs faster.
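The idea of a school of agents, each holding one model, can be illustrated with a small sketch. This is not the FSLR implementation: it assumes models are restricted to AND-combinations of (possibly negated) binary SNP indicators, it scores a model by plain classification accuracy, and it uses only two behaviors ("prey" and "follow") borrowed from the general artificial fish-swarm literature rather than FSLR's own preset behaviors. All function names, parameters, and the toy data are hypothetical.

```python
import itertools
import random

def predict(model, row):
    # model: frozenset of (snp_index, negated) literals joined by AND (assumed form)
    return all(row[i] != negated for i, negated in model)

def score(model, X, y):
    # Fraction of samples whose phenotype the model predicts correctly
    return sum(predict(model, r) == t for r, t in zip(X, y)) / len(y)

def mutate(model, n_snps, rng):
    # Local move: drop one literal or add a random one
    literals = set(model)
    if literals and rng.random() < 0.5:
        literals.remove(rng.choice(sorted(literals)))
    else:
        literals.add((rng.randrange(n_snps), rng.random() < 0.5))
    return frozenset(literals)

def fish_swarm_search(X, y, n_fish=10, n_iter=50, follow_prob=0.3, seed=0):
    rng = random.Random(seed)
    n_snps = len(X[0])
    # Each fish starts with a random one-literal model
    school = [mutate(frozenset(), n_snps, rng) for _ in range(n_fish)]
    best = max(school, key=lambda m: score(m, X, y))
    for _ in range(n_iter):
        for k, fish in enumerate(school):
            current = score(fish, X, y)
            # "Follow" (assumed behavior): sometimes jump to the best model so far
            if score(best, X, y) > current and rng.random() < follow_prob:
                school[k] = best
                continue
            # "Prey" (assumed behavior): try a local move, keep it if not worse
            candidate = mutate(fish, n_snps, rng)
            if score(candidate, X, y) >= current:
                school[k] = candidate
        best = max(school + [best], key=lambda m: score(m, X, y))
    return best, score(best, X, y)

# Toy data (hypothetical): 6 binary SNPs, phenotype = SNP0 AND NOT SNP2
X = list(itertools.product([False, True], repeat=6))
y = [row[0] and not row[2] for row in X]
true_model = frozenset({(0, False), (2, True)})
best, best_score = fish_swarm_search(X, y)
print(sorted(best), best_score)
```

Because many fish explore different models at once and occasionally copy the best one found so far, a single fish stuck on a poor model does not stall the search; this is the intuition behind the convergence claim above, shown here only in toy form.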
