GPNN: Power studies and applications of a neural network method for detecting gene-gene interactions in studies of human disease

Background: The identification and characterization of genes that influence the risk of common, complex multifactorial diseases primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. We have previously introduced a genetic programming optimized neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of gene combinations associated with disease risk. The goal of this study was to evaluate the power of GPNN for identifying high-order gene-gene interactions and to apply GPNN to a real data analysis in Parkinson's disease.

Results: We show that GPNN has high power to detect even relatively small genetic effects (2-3% heritability) in simulated data models involving two- and three-locus interactions. The limits of detection were reached under conditions with very small heritability (<1%) or when interactions involved more than three loci. We tested GPNN on a real dataset of Parkinson's disease cases and controls and found a two-locus interaction between the DLST gene and sex.

Conclusion: These results indicate that GPNN may be a useful pattern recognition approach for detecting gene-gene and gene-environment interactions.
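To make the heritability figures quoted in the results more concrete, the following is a minimal sketch of how a heritability value can be computed for a two-locus penetrance model. It is an illustration under stated assumptions, not the simulation code used in the study: the function names, the Hardy-Weinberg genotype frequencies, the allele frequencies of 0.5, and the example penetrance table are all hypothetical, and the formula used (variance of penetrance across genotypes divided by K(1 - K), where K is disease prevalence) is one common definition for penetrance-based simulations.

```python
from itertools import product


def genotype_freqs(p):
    """Hardy-Weinberg frequencies for genotypes AA, Aa, aa (assumption)."""
    q = 1.0 - p
    return [p * p, 2.0 * p * q, q * q]


def heritability(penetrance, p_a=0.5, p_b=0.5):
    """Heritability of a 3x3 two-locus penetrance table.

    penetrance[i][j] is P(disease | genotype i at locus A, genotype j at locus B).
    Returns (prevalence K, heritability).
    """
    fa = genotype_freqs(p_a)
    fb = genotype_freqs(p_b)
    cells = list(product(range(3), repeat=2))
    # Population prevalence: penetrance averaged over genotype frequencies.
    k = sum(fa[i] * fb[j] * penetrance[i][j] for i, j in cells)
    # Variance in disease risk attributable to genotype.
    var_g = sum(fa[i] * fb[j] * (penetrance[i][j] - k) ** 2 for i, j in cells)
    return k, var_g / (k * (1.0 - k))


# Hypothetical purely epistatic table: risk is raised only for certain
# genotype combinations, so neither locus has a marginal effect on its own.
example_table = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

prevalence, h2 = heritability(example_table)
print(f"prevalence = {prevalence:.3f}, heritability = {h2:.3f}")
```

Scaling the non-zero penetrance values down lowers the heritability, which is the sense in which the smaller effect sizes mentioned above are harder to detect.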
