Simulation-based Regularized Logistic Regression

In this paper, we develop a simulation-based framework for regularized logistic regression, exploiting two novel results for scale mixtures of normals. By carefully choosing a hierarchical model for the likelihood by one type of mixture, and implementing regularization with another, we obtain new MCMC schemes with varying efficiency depending on the data type (binary vs. binomial, say) and the desired estimator (maximum likelihood, maximum a posteriori, posterior mean). Advantages of our omnibus approach include flexibility, computational efficiency, applicability in p ≫ n settings, uncertainty estimates, variable selection, and assessing the optimal degree of regularization. We compare our methodology to modern alternatives on both synthetic and real data. An R package called reglogit is available on CRAN.
