Reducing Model Misspecification and Bias in the Estimation of Interactions

Analyzing variation in treatment effects across subsets of the population is an important way for social scientists to evaluate theoretical arguments. A common strategy for assessing such treatment effect heterogeneity is to include a multiplicative interaction term between the treatment and a hypothesized effect modifier in a regression model. Unfortunately, this approach can produce biased inferences when interactions between the effect modifier and other covariates are left unmodeled, while including all of these interactions can produce unstable estimates due to overfitting. In this paper, we explore the usefulness of machine learning algorithms for stabilizing these estimates and show that many off-the-shelf adaptive methods suffer from two forms of bias: direct and indirect regularization bias. To overcome these issues, we use a post-double selection approach that relies on several lasso estimators to select the interactions to include in the final model. We extend this approach to estimate uncertainty for both interaction and marginal effects. Simulation evidence shows that this approach outperforms competing methods, even when the number of covariates is large. We show in two empirical examples that the choice of method leads to dramatically different conclusions about effect heterogeneity.
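
To make the argument concrete, the following is a minimal sketch of the models involved, not a reproduction of the paper's own equations. The notation is assumed for illustration: $Y_i$ is the outcome, $D_i$ the treatment, $V_i$ the hypothesized effect modifier, $X_i$ a vector of other covariates, and $Z_i$ the set of candidate controls. The penalty level $\lambda_W$ and the choice of which terms are left unpenalized are likewise assumptions of this sketch.

```latex
% Assumed notation: Y_i outcome, D_i treatment, V_i moderator, X_i covariates.
\begin{align}
% (1) Single-interaction model: only D_i is interacted with V_i.
Y_i &= \alpha_0 + \alpha_1 D_i + \alpha_2 V_i + \alpha_3 D_i V_i
       + X_i^\top \beta + \varepsilon_i \\
% (2) Fully moderated model: V_i is also interacted with every covariate.
Y_i &= \alpha_0 + \alpha_1 D_i + \alpha_2 V_i + \alpha_3 D_i V_i
       + X_i^\top \beta + V_i X_i^\top \gamma + \varepsilon_i \\
% (3) Marginal effect of the treatment implied by either model.
\frac{\partial\, \mathbb{E}[Y_i \mid D_i, V_i, X_i]}{\partial D_i}
    &= \alpha_1 + \alpha_3 V_i
\end{align}

% If \gamma \neq 0 and D_i V_i is correlated with V_i X_i given the other
% regressors, fitting (1) instead of (2) leaves omitted-variable bias in \alpha_3.

% Post-double selection sketch with candidate controls Z_i = (X_i, V_i X_i):
\begin{align}
% (4) One lasso per target W_i \in \{ Y_i,\; D_i,\; D_i V_i \}.
\hat{\theta}^{W} &= \arg\min_{\theta}\;
    \frac{1}{2n} \sum_{i=1}^{n} \bigl( W_i - Z_i^\top \theta \bigr)^2
    + \lambda_W \lVert \theta \rVert_1 \\
% (5) Keep every control selected by at least one of the lassos.
\hat{S} &= \bigcup_{W} \bigl\{ j : \hat{\theta}^{W}_j \neq 0 \bigr\} \\
% (6) Final step: unpenalized least squares on the selected controls.
Y_i &= \alpha_0 + \alpha_1 D_i + \alpha_2 V_i + \alpha_3 D_i V_i
       + Z_{i,\hat{S}}^\top \pi + \varepsilon_i
\end{align}
```

Under these assumptions, the union in (5) is what guards against indirect regularization bias: a control is retained if it predicts the outcome or either treatment term, so a lasso that shrinks a relevant interaction to zero in one regression does not by itself remove that interaction from the final model. Because (6) is fit without a penalty, $\alpha_1$ and $\alpha_3$ are not shrunk directly.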
