A Fully Nonparametric Modeling Approach to Binary Regression

© 2015 International Society for Bayesian Analysis. We propose a general nonparametric Bayesian framework for binary regression, built by modeling the joint response-covariate distribution. The observed binary responses are assumed to arise from underlying continuous random variables through discretization, and we model the joint distribution of these latent responses and the covariates using a Dirichlet process mixture of multivariate normals. We show that the kernel of the induced mixture model for the observed data is identifiable under a restriction on the latent variables. To allow for an appropriate dependence structure while facilitating identifiability, we use a square-root-free Cholesky decomposition of the covariance matrix in the normal mixture kernel. In addition to accommodating the necessary restriction, this modeling strategy substantially simplifies the implementation of Markov chain Monte Carlo posterior simulation. We present two data examples taken from areas for which the methodology is especially well suited: the first involves estimation of relationships between environmental variables, and the second develops inference for natural selection surfaces in evolutionary biology. Finally, we discuss extensions to regression settings with ordinal responses.
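The two technical ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the latent response z is ordered first in the kernel, the discretization is y = 1 if and only if z > 0, a finite mixture stands in for the Dirichlet process mixture, and there is a single covariate x. The function names `ldl_decompose` and `prob_y1_given_x` are hypothetical.

```python
import math

def ldl_decompose(sigma):
    """Square-root-free Cholesky: sigma = L D L^T with L unit lower
    triangular and D diagonal (returned as the list d)."""
    n = len(sigma)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            partial = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - partial) / d[j]
    return L, d

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def prob_y1_given_x(x, weights, means, covs):
    """Pr(y = 1 | x) induced by a finite mixture of bivariate normals
    for (z, x), assuming y = 1 iff z > 0 (an illustrative convention)."""
    num = 0.0
    den = 0.0
    for p, (mu_z, mu_x), cov in zip(weights, means, covs):
        s_zz, s_zx, s_xx = cov[0][0], cov[0][1], cov[1][1]
        dens_x = p * normal_pdf(x, mu_x, s_xx)         # covariate-dependent weight
        cond_mean = mu_z + s_zx / s_xx * (x - mu_x)    # E[z | x] within the component
        cond_var = s_zz - s_zx ** 2 / s_xx             # Var[z | x] within the component
        num += dens_x * normal_cdf(cond_mean / math.sqrt(cond_var))
        den += dens_x
    return num / den

# Illustrative values only (not taken from the paper).
sigma = [[1.0, 0.5, 0.2],
         [0.5, 2.0, 0.3],
         [0.2, 0.3, 1.5]]
L, d = ldl_decompose(sigma)
# With z ordered first, d[0] equals Var(z), so a restriction on the latent
# scale can be imposed by fixing a single diagonal element.
p = prob_y1_given_x(
    0.3,
    weights=[0.6, 0.4],
    means=[(-0.5, -1.0), (0.8, 1.2)],
    covs=[[[1.0, 0.4], [0.4, 1.0]], [[1.0, -0.3], [-0.3, 0.5]]],
)
```

Because the first diagonal element of D coincides with the marginal variance of the first variable, constraining the latent response's scale touches only one free parameter, while the remaining entries of L and D stay unrestricted. The second function shows how modeling the joint distribution yields a regression function whose mixture weights vary with x, which is what allows non-standard shapes for Pr(y = 1 | x).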
