Gene-Gene Interaction Tests Using SVM and Neural Network Modeling

Artificial neural network (ANN) and support vector machine (SVM) modeling offer promise for analyzing genotype-phenotype correlation in genetic association studies. In particular, we study single nucleotide polymorphisms (SNPs) as genetic markers that predict a dichotomous disease outcome. The problem we investigate is that of gene-gene and gene-environment interactions as determinants of the expression of complex diseases. This study builds on our earlier single-gene testing procedure (Matchenko-Shimko and Dube, 2006). As in that single-SNP pre-selection, we rely on ANN sensitivity analysis algorithms to detect potential pairs of interacting SNPs associated with the disease outcome. The statistical test for SNP interaction is computed using a bootstrap technique. It is based on a measure of the predictive significance of two SNPs: the change in the ANN error function (or the SVM regression error) when these two SNPs are removed from the ANN or SVM genotype-phenotype model. To investigate the power to detect and test gene-gene interactions, we simulated genotypes including two interacting loci with low marginal effects, incomplete penetrance, and phenocopies according to three different models of interaction.
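The pair-removal idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: a majority-vote genotype lookup table stands in for the ANN or SVM model, the simulated interaction is a hypothetical XOR-like penetrance model (chosen because it gives low marginal effects), and the names `simulate`, `fit_table`, and `bootstrap_delta`, the penetrance values, and the out-of-bag error estimate are all illustrative choices, not taken from the paper.

```python
import random
from collections import Counter, defaultdict

def simulate(n, rng, maf=0.3, n_snps=4, high=0.8, low=0.1):
    """Simulate n subjects with SNPs coded 0/1/2 (minor-allele count).

    Hypothetical model: SNPs 0 and 1 interact XOR-style on carrier status.
    `high` < 1 mimics incomplete penetrance; `low` > 0 mimics phenocopies.
    """
    X, y = [], []
    for _ in range(n):
        g = [sum(rng.random() < maf for _ in range(2)) for _ in range(n_snps)]
        risk = high if (g[0] > 0) != (g[1] > 0) else low
        X.append(g)
        y.append(1 if rng.random() < risk else 0)
    return X, y

def fit_table(X, y, cols):
    """Stand-in for ANN/SVM: majority class per genotype combination on cols."""
    counts = defaultdict(Counter)
    for g, label in zip(X, y):
        counts[tuple(g[c] for c in cols)][label] += 1
    default = Counter(y).most_common(1)[0][0]  # fallback for unseen genotypes
    table = {k: c.most_common(1)[0][0] for k, c in counts.items()}
    return lambda g: table.get(tuple(g[c] for c in cols), default)

def error_rate(model, X, y):
    """Fraction of misclassified subjects."""
    return sum(model(g) != label for g, label in zip(X, y)) / len(y)

def bootstrap_delta(X, y, pair, n_boot, rng):
    """Bootstrap the error increase caused by removing the SNP pair.

    Each replicate trains on a resample and scores on the out-of-bag subjects.
    Returns the sorted list of (reduced-model error - full-model error).
    """
    n = len(y)
    all_cols = list(range(len(X[0])))
    kept = [c for c in all_cols if c not in pair]
    deltas = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        Xo, yo = [X[i] for i in oob], [y[i] for i in oob]
        full = fit_table(Xb, yb, all_cols)
        reduced = fit_table(Xb, yb, kept)
        deltas.append(error_rate(reduced, Xo, yo) - error_rate(full, Xo, yo))
    return sorted(deltas)

rng = random.Random(0)
X, y = simulate(600, rng)
deltas = bootstrap_delta(X, y, pair=(0, 1), n_boot=200, rng=rng)
lower = deltas[int(0.05 * len(deltas))]  # one-sided 5th percentile
print("median error increase:", deltas[len(deltas) // 2])
print("5th percentile:", lower, "-> pair looks predictive" if lower > 0 else "")
```

In this sketch a pair is flagged when the lower percentile of the bootstrap distribution of the error increase stays above zero. Running the same function on a pair of non-causal SNPs, for example `pair=(2, 3)`, gives a reference distribution centred near zero for comparison.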

[1]  Gunnar Rätsch,et al.  Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites , 2000, German Conference on Bioinformatics.

[2]  J. Witte,et al.  Genetic dissection of complex traits , 1996, Nature Genetics.

[3]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[4]  Nelli Shimko,et al.  Bootstrap Inference with Neural-Network Modeling for Gene-Disease Association Testing , 2006, 2006 IEEE Symposium on Computational Intelligence and Bioinformatics and Computational Biology.

[5]  N. Schork,et al.  Genetics of complex disease: approaches, problems, and solutions. , 1997, American journal of respiratory and critical care medicine.

[6]  C. Carlson,et al.  Mapping complex disease loci in whole-genome association studies , 2004, Nature.

[7]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[8]  Alfonso Palmer,et al.  Numeric sensitivity analysis applied to feedforward neural networks , 2003, Neural Computing & Applications.

[9]  N. Risch.  Searching for genetic determinants in the new millennium , 2000, Nature.

[10]  N. Schork,et al.  Single nucleotide polymorphisms and the future of genetic epidemiology , 2000, Clinical genetics.

[11]  Thorsten Joachims,et al.  Making large scale SVM learning practical , 1998 .

[12]  N. Schork,et al.  Who's afraid of epistasis? , 1996, Nature Genetics.

[13]  Scott M. Williams,et al.  New strategies for identifying gene-gene interactions in hypertension , 2002, Annals of medicine.

[14]  E. Lander,et al.  Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease , 2003, Nature Genetics.

[15]  T. Reich,et al.  A perspective on epistasis: limits of models displaying no main effect. , 2002, American journal of human genetics.

[16]  Lon R. Cardon,et al.  The complex interplay among factors that influence allelic association , 2004, Nature Reviews Genetics.

[18]  Paula A. Kiberstis,et al.  It's Not Just the Genes , 2002, Science.

[19]  L. Palmer,et al.  Using single nucleotide polymorphisms as a means to understanding the pathophysiology of asthma , 2001, Respiratory research.

[20]  J. Witte,et al.  Genetic dissection of complex traits. , 1994, Nature genetics.

[21]  P. Donnelly,et al.  Genome-wide strategies for detecting multiple loci that influence complex diseases , 2005, Nature Genetics.

[22]  Scott M. Williams,et al.  The use of animal models in the study of complex disease: all else is never equal or why do so many human studies fail to replicate animal findings? , 2004, BioEssays : news and reviews in molecular, cellular and developmental biology.

[23]  Jacek M. Zurada,et al.  Sensitivity analysis for minimization of input data dimension for feedforward neural network , 1994, Proceedings of IEEE International Symposium on Circuits and Systems - ISCAS '94.

[24]  J. Ott,et al.  Mathematical multi-locus approaches to localizing complex human trait genes , 2003, Nature Reviews Genetics.

[25]  Federico Girosi,et al.  Support Vector Machines: Training and Applications , 1997 .

[26]  Bf Buxton,et al.  An introduction to support vector machines for data mining , 2001 .

[27]  N. Schork.  Genetically complex cardiovascular traits. Origins, problems, and potential solutions. , 1997, Hypertension.

[28]  Timothy Masters,et al.  Practical neural network recipes in C , 1993 .