Heteroscedastic BART via Multiplicative Regression Trees

Abstract

Bayesian additive regression trees (BART) has become increasingly popular as a flexible and scalable nonparametric regression approach for modern applied statistics problems. For the practitioner dealing with large and complex nonlinear response surfaces, its advantages include a matrix-free formulation and the lack of a requirement to prespecify a confining regression basis. Although flexible in fitting the mean, BART has been limited by its reliance on a constant variance error model. Alleviating this limitation, we propose HBART, a nonparametric heteroscedastic elaboration of BART. In BART, the mean function is modeled with a sum of trees, each of which determines an additive contribution to the mean. In HBART, the variance function is further modeled with a product of trees, each of which determines a multiplicative contribution to the variance. Like the mean model, this flexible, multidimensional variance model is entirely nonparametric with no need for the prespecification of a confining basis. Moreover, with this enhancement, HBART can provide insights into the potential relationships of the predictors with both the mean and the variance. Practical implementations of HBART with revealing new diagnostic plots are demonstrated with simulated and real data on used car prices and song year of release. Supplementary materials for this article are available online.
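The model structure described in the abstract can be written compactly. The following is a sketch only, with notation (m, m', g, h, T_j, M_j) assumed here rather than taken from the article:

\[
  Y(x) = f(x) + s(x)\,Z, \qquad Z \sim \mathcal{N}(0, 1),
\]
\[
  f(x) = \sum_{j=1}^{m} g(x;\, T_j, M_j), \qquad
  s^2(x) = \prod_{\ell=1}^{m'} h(x;\, T'_\ell, M'_\ell),
\]

where each \(g(x;\, T_j, M_j)\) is the output of regression tree \(T_j\) with leaf parameters \(M_j\), contributing additively to the mean, and each \(h(x;\, T'_\ell, M'_\ell)\) is a (positive) tree output contributing multiplicatively to the variance. Under this reading, both the mean surface \(f\) and the variance surface \(s^2\) are fully nonparametric, with no prespecified regression basis for either.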
