A Hybrid Approach to Selecting Susceptible Single Nucleotide Polymorphisms for Complex Disease Analysis

Artificial neural networks (ANNs) are an increasingly popular and promising tool for diagnosing complex diseases. Single nucleotide polymorphism (SNP) data from individuals are fed to an ANN as inputs in order to identify SNP patterns associated with a given disease. Because the number of SNPs is large, it is crucial to select an optimal subset of SNPs and their combinations so that the number of ANN inputs can be reduced. With this in mind, a hybrid approach that combines genetic algorithms (GA) with an ANN, called GANN, is used to automatically determine the optimal SNP set and to optimize the structure of the ANN. The proposed GANN algorithm is evaluated on both a synthetic dataset and a real SNP dataset of a complex disease.
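
The general idea of wrapping a GA around an ANN for SNP selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`make_synthetic_snps`, `train_ann_accuracy`, `gann_select`), the two-SNP interaction disease model, the per-SNP penalty in the fitness, and all hyperparameters are hypothetical. For brevity the sketch evolves only the SNP mask and keeps the hidden layer size fixed, whereas the approach described above also optimizes the ANN structure.

```python
import numpy as np

def make_synthetic_snps(n=200, n_snps=20, causal=(2, 7), seed=0):
    """Toy data (assumption): genotypes coded 0/1/2, disease status driven
    by an interaction between two hypothetical causal SNPs."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n, n_snps)).astype(float)
    y = ((X[:, causal[0]] > 0) & (X[:, causal[1]] > 0)).astype(float)
    return X, y

def train_ann_accuracy(X_tr, y_tr, X_te, y_te, hidden=4, epochs=200, lr=0.5, seed=0):
    """Train a one-hidden-layer network by gradient descent and return
    its accuracy on the held-out split."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X_tr.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X_tr @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))
        g = (p - y_tr) / len(y_tr)             # cross-entropy gradient w.r.t. logit
        gH = np.outer(g, W2) * (1.0 - H ** 2)  # backpropagate through tanh
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X_tr.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    pred = (np.tanh(X_te @ W1 + b1) @ W2 + b2) > 0
    return float((pred == (y_te > 0.5)).mean())

def fitness(mask, X_tr, y_tr, X_te, y_te, penalty=0.01):
    """Held-out accuracy minus a small cost per selected SNP (assumed form)."""
    if not mask.any():
        return 0.0
    acc = train_ann_accuracy(X_tr[:, mask], y_tr, X_te[:, mask], y_te)
    return acc - penalty * int(mask.sum())

def gann_select(X, y, pop_size=20, generations=15, p_mut=0.05, seed=0):
    """GA over boolean SNP masks: binary tournament selection,
    uniform crossover, bit-flip mutation, and elitism."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    half = len(y) // 2
    X_tr, y_tr, X_te, y_te = X[:half], y[:half], X[half:], y[half:]
    pop = rng.random((pop_size, n)) < 0.3
    best_mask, best_fit = None, -np.inf
    for _ in range(generations):
        fits = np.array([fitness(m, X_tr, y_tr, X_te, y_te) for m in pop])
        i = int(fits.argmax())
        if fits[i] > best_fit:
            best_fit, best_mask = float(fits[i]), pop[i].copy()
        children = [best_mask.copy()]          # elitism: keep the best so far
        while len(children) < pop_size:
            parents = []
            for _ in range(2):                 # two binary tournaments
                a, b = rng.integers(0, pop_size, 2)
                parents.append(pop[a] if fits[a] >= fits[b] else pop[b])
            take_first = rng.random(n) < 0.5   # uniform crossover
            child = np.where(take_first, parents[0], parents[1])
            child ^= rng.random(n) < p_mut     # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    return best_mask, best_fit

if __name__ == "__main__":
    X, y = make_synthetic_snps()
    mask, fit = gann_select(X, y)
    print("selected SNP indices:", np.flatnonzero(mask), "fitness:", round(fit, 3))
```

Scoring each candidate subset by held-out accuracy with a small per-SNP penalty is one simple way to push the search toward compact SNP sets; a real study would more likely use cross-validation and a separate test set to avoid selecting SNPs that merely overfit the evaluation split.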
