Penalized logistic regression for detecting gene interactions.

We propose using a variant of logistic regression (LR) with L2-regularization to fit gene-gene and gene-environment interaction models. Studies have shown that many common diseases are influenced by interactions among certain genes. LR models with quadratic penalization not only correctly characterize the influential genes along with their interaction structures but also yield additional benefits in handling high-dimensional, discrete factors with a binary response. We illustrate the advantages of using an L2-regularization scheme and compare its performance with that of "multifactor dimensionality reduction" and "FlexTree," two recent tools for identifying gene-gene interactions. Through simulated and real data sets, we demonstrate that our method outperforms these methods in identifying the interaction structures as well as in prediction accuracy. In addition, we validate the significance of the selected factors through bootstrap analyses.
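To make the idea of quadratically penalized LR with interaction terms concrete, the following is a minimal sketch, not the authors' implementation. It assumes two hypothetical three-level genotype factors, codes each level as an indicator column, forms pairwise products as interaction columns, and fits the model by Newton-Raphson with an L2 penalty on every coefficient except the intercept. The function names, the simulated risk model, and the penalty value `lam` are all illustrative assumptions.

```python
import numpy as np

def dummy_code(g, levels=(0, 1, 2)):
    # One indicator column per genotype level (assumed coding, for illustration).
    return np.column_stack([(g == l).astype(float) for l in levels])

def interaction_columns(A, B):
    # All pairwise products of the indicator columns of two factors.
    return np.column_stack([A[:, i] * B[:, j]
                            for i in range(A.shape[1])
                            for j in range(B.shape[1])])

def fit_l2_logistic(X, y, lam=1.0, n_iter=50, tol=1e-8):
    # Newton-Raphson for: minimize -loglik(beta) + (lam / 2) * ||beta[1:]||^2.
    # The intercept (beta[0]) is left unpenalized.
    n, p = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])
    P = lam * np.eye(p + 1)
    P[0, 0] = 0.0
    beta = np.zeros(p + 1)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
        w = mu * (1.0 - mu)
        grad = Xb.T @ (y - mu) - P @ beta          # penalized score
        hess = Xb.T @ (Xb * w[:, None]) + P        # penalized information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated data under a hypothetical model: risk is elevated only when
# both loci carry at least one variant allele (a pure interaction effect).
rng = np.random.default_rng(0)
n = 600
g1 = rng.integers(0, 3, size=n)
g2 = rng.integers(0, 3, size=n)
logit = -1.0 + 2.0 * ((g1 > 0) & (g2 > 0))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

D1, D2 = dummy_code(g1), dummy_code(g2)
X = np.hstack([D1, D2, interaction_columns(D1, D2)])  # 3 + 3 + 9 columns
beta = fit_l2_logistic(X, y, lam=1.0)
```

The indicator columns of each factor sum to one, so the unpenalized design is rank deficient; in this sketch the quadratic penalty is what makes the Hessian positive definite and the fit well defined, which is one way to read the abstract's remark about handling discrete factors.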
