Variable selection via penalized minimum φ-divergence estimation in logistic regression

We propose a penalized minimum φ-divergence estimator for parameter estimation and variable selection in logistic regression. Using an appropriate penalty function, we show that the penalized φ-divergence estimator has the oracle property: with probability tending to 1, it identifies the true model and estimates the nonzero coefficients as efficiently as if the sparsity of the true model were known in advance. The advantage of the penalized φ-divergence estimator is that it estimates the nonzero parameters more efficiently than the penalized maximum likelihood estimator when the sample size is small, and is asymptotically equivalent to it for large samples. Numerical simulations confirm our findings.
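To make the construction concrete, the following is a minimal sketch (not the authors' code) of the penalized objective: a weighted φ-divergence between observed and fitted success probabilities for grouped binary data, plus a sparsity-inducing penalty. It assumes the Cressie-Read power-divergence member of the φ family and the SCAD penalty of Fan and Li (2001); the function names, the tuning values, and the generic BFGS call are illustrative choices only. Since SCAD is nonconvex and nonsmooth at the origin, a generic optimizer shrinks rather than exactly zeroes out coefficients; a specialized algorithm (e.g., a local quadratic approximation) would be needed for exact sparsity.

```python
# Sketch of penalized minimum phi-divergence estimation for logistic
# regression with grouped binary data. Assumptions: Cressie-Read power
# divergence as the phi function, SCAD penalty, intercept unpenalized.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def power_divergence(p, q, lam=2/3, eps=1e-10):
    """Cressie-Read power divergence between probability vectors p and q."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return np.sum(p * ((p / q) ** lam - 1.0)) / (lam * (lam + 1.0))

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty (Fan and Li, 2001), summed over coordinates."""
    b = np.abs(beta)
    small = lam * b
    mid = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))
    large = lam**2 * (a + 1) / 2
    return np.sum(np.where(b <= lam, small,
                  np.where(b <= a * lam, mid, large)))

def penalized_phi_objective(beta, X, y, n, lam_pen, lam_phi=2/3):
    """Weighted phi-divergence between observed proportions y/n and fitted
    probabilities expit(X beta), plus the SCAD penalty on the slopes."""
    pi = expit(X @ beta)      # fitted success probability per covariate cell
    phat = y / n              # observed proportion per covariate cell
    N = n.sum()
    div = sum((n[i] / N) * power_divergence(
                  np.array([phat[i], 1 - phat[i]]),
                  np.array([pi[i], 1 - pi[i]]), lam_phi)
              for i in range(len(n)))
    return div + scad_penalty(beta[1:], lam_pen)  # intercept not penalized

# Usage on simulated grouped data: n[i] trials, y[i] successes at row X[i].
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 4))])
beta_true = np.array([0.5, 1.5, 0.0, 0.0, -1.0])   # two truly zero slopes
n = rng.integers(5, 30, size=50)
y = rng.binomial(n, expit(X @ beta_true))
fit = minimize(penalized_phi_objective, x0=np.zeros(5),
               args=(X, y, n, 0.05), method="BFGS")
print(np.round(fit.x, 3))
```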
