Bayesian multinomial regression with class-specific predictor selection

Consider a multinomial regression model in which the response, indicating a unit's membership in one of several unordered classes, is associated with a set of predictor variables. Such models typically involve a matrix of regression coefficients, with the $(j,k)$ element modulating the effect of the $k$th predictor on the propensity of the unit to belong to the $j$th class. Supposing that only a subset of the available predictors is associated with the response therefore corresponds to some columns of the coefficient matrix being zero. Under the Bayesian paradigm, the subset of predictors associated with the response can be treated as an unknown parameter, leading to standard Bayesian model selection and model averaging procedures. As an alternative, we investigate model selection and averaging in which a subset of individual elements of the coefficient matrix is zero, so that the subset of predictors associated with the propensity to belong to a class varies with the class. We refer to this as class-specific predictor selection. We argue that such a scheme can be attractive on both conceptual and computational grounds.
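The difference between the two sparsity patterns can be illustrated with a minimal sketch. It assumes a softmax (multinomial logit) link purely for illustration, since the abstract does not fix a link function. The names `class_probabilities`, `gamma_cols`, and `gamma_elems` are hypothetical, and the coefficient values and indicator patterns are arbitrary rather than taken from the paper.

```python
import numpy as np

def class_probabilities(X, B):
    """Softmax class probabilities for an (n, K) design X and (J, K) coefficients B.

    Assumption: a multinomial logit link, chosen only for illustration.
    """
    eta = X @ B.T                           # (n, J) linear predictors
    eta -= eta.max(axis=1, keepdims=True)   # stabilise the exponentials
    expeta = np.exp(eta)
    return expeta / expeta.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
J, K, n = 3, 4, 5                           # classes, predictors, units
B_full = rng.normal(size=(J, K))            # row j = class, column k = predictor
X = rng.normal(size=(n, K))

# Predictor-level selection: one indicator per predictor,
# so an excluded predictor zeroes an entire column.
gamma_cols = np.array([1, 0, 1, 1])
B_predictor = B_full * gamma_cols

# Class-specific selection: one indicator per (class, predictor) pair,
# so each class keeps its own subset of predictors.
gamma_elems = np.array([[1, 0, 1, 0],
                        [0, 0, 1, 1],
                        [1, 0, 0, 1]])
B_class = B_full * gamma_elems

P_predictor = class_probabilities(X, B_predictor)
P_class = class_probabilities(X, B_class)
```

Here `gamma_cols` has $K$ entries while `gamma_elems` has $J \times K$, so the class-specific scheme searches a larger model space but can represent a predictor that matters for one class and not another.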
