Information Borrowing in Regression Models

Model development typically weighs data structure, subject-matter considerations, model assumptions, and goodness of fit. Diagnosing issues with any of these factors is easier when regression model estimates can be understood at a more granular level. We propose a new method for decomposing point estimates from a regression model via weights placed on data clusters. The weights are determined solely by the model specification and data availability, and thus explicitly link the effects of data imbalance and model assumptions to actual model estimates. In linear models, this weight matrix is known in the existing literature as the hat matrix. We extend it to Bayesian hierarchical regression models that incorporate prior information and complicated dependence structures through the covariance among random effects. We show that the model weights, which we call borrowing factors, generalize shrinkage and information borrowing to all regression models; in contrast, the hat-matrix literature has focused mainly on the diagonal elements, which quantify leverage. We also provide practically useful metrics that summarize the borrowing factors. We present the theoretical properties of the borrowing factors and associated metrics and demonstrate their use in two examples. By explicitly quantifying borrowing and shrinkage, researchers can better incorporate domain knowledge and evaluate model performance and the impact of data properties such as imbalance or influential points.
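To make the linear-model special case concrete: in ordinary least squares, the hat matrix H = X(XᵀX)⁻¹Xᵀ is exactly the weight matrix described above, since the fitted values satisfy ŷ = Hy. Each row of H gives the weights one fitted value places on every observation, and the diagonal entries are the leverages emphasized in classical regression diagnostics. The sketch below illustrates this with numpy; it is a standard textbook computation, not the paper's borrowing-factor implementation.

```python
import numpy as np

# Standard OLS hat-matrix illustration (not the paper's code): the fitted
# values are a linear combination of the observations, y_hat = H y.
rng = np.random.default_rng(0)
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.normal(size=n)

# H = X (X'X)^{-1} X' is the n x n weight matrix; row i holds the weights
# that fitted value i places on each observation, and diag(H) is leverage.
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = H @ y

# Sanity checks: H reproduces the OLS fit, and trace(H) equals the number
# of regression parameters p (a classical property of the hat matrix).
beta = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose(y_hat, X @ beta)
assert np.isclose(np.trace(H), p)
```

The paper's borrowing factors extend this same "estimate as a weighted combination of data" view beyond OLS, where the weights additionally reflect priors and random-effect covariance rather than the design matrix alone.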
