Bayesian Methods for High Dimensional Linear Models

In this article, we present a selective overview of recent developments in Bayesian model and variable selection methods for high-dimensional linear models. While most reviews in the literature concentrate on conventional methods, we focus on recently developed methods that have proven successful in high-dimensional variable selection. First, we give a brief overview of traditional model selection methods (viz. Mallows' Cp, AIC, BIC, DIC), followed by a discussion of more recent methods (viz. EBIC, regularization) that have attracted considerable attention from statisticians. Then, we review high-dimensional Bayesian methods, with particular emphasis on Bayesian regularization methods, which have been used extensively in recent years. We conclude by briefly addressing the asymptotic behavior of Bayesian variable selection methods for high-dimensional linear models under different regularity conditions.
