Heteroscedastic BART Using Multiplicative Regression Trees

BART (Bayesian Additive Regression Trees) has become increasingly popular as a flexible and scalable nonparametric regression approach for modern applied statistics problems. For the practitioner dealing with large and complex nonlinear response surfaces, its advantages include a matrix-free formulation and the lack of a requirement to prespecify a confining regression basis. Although flexible in fitting the mean, BART has been limited by its reliance on a constant-variance error model. This homoscedastic assumption is unrealistic in many applications. Alleviating this limitation, we propose HBART, a nonparametric heteroscedastic elaboration of BART. In BART, the mean function is modeled with a sum of trees, each of which determines an additive contribution to the mean. In HBART, the variance function is further modeled with a product of trees, each of which determines a multiplicative contribution to the variance. Like the mean model, this flexible, multidimensional variance model is entirely nonparametric, with no need for the prespecification of a confining basis. Moreover, with this enhancement, HBART can provide insights into the potential relationships of the predictors with both the mean and the variance. Practical implementations of HBART with revealing new diagnostic plots are demonstrated with simulated and real data on used car prices, fishing catch production, and alcohol consumption.
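The sum-of-trees mean and product-of-trees variance described above can be sketched in one model equation. The notation below ($m$, $m'$, the tree/parameter pairs $(T_j, M_j)$ and $(T'_k, M'_k)$, and the tree functions $g$ and $h$) is assumed in the spirit of the BART literature, not quoted from the paper:

```latex
Y(x) \;=\; \sum_{j=1}^{m} g(x;\, T_j, M_j) \;+\; s(x)\, Z,
\qquad Z \sim \mathcal{N}(0,1),
\qquad
s^2(x) \;=\; \prod_{k=1}^{m'} h(x;\, T'_k, M'_k).
```

Each $g(x; T_j, M_j)$ contributes additively to the mean, as in standard BART, while each $h(x; T'_k, M'_k)$ rescales the variance multiplicatively; the product form keeps $s^2(x)$ positive without imposing any parametric shape on the variance surface.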
