Reversible jump Markov chain Monte Carlo algorithms for Bayesian variable selection in logistic mixed models

ABSTRACT In this article, to reduce the computational load of Bayesian variable selection, we used a variant of reversible jump Markov chain Monte Carlo methods, the Holmes and Held (HH) algorithm, to sample model index variables in logistic mixed models involving a large number of explanatory variables. Furthermore, we proposed a simple proposal distribution for the model index variables, and used a simulation study and a real example to compare the performance of the HH algorithm under our proposal distribution with its performance under existing proposal distributions. The results show that the HH algorithm with our proposal distribution is a computationally efficient and reliable selection method.
