A simple way to assess inference methods

We propose a simple way to assess the quality of the asymptotic approximations on which inference methods rely. The assessment can detect problems when the asymptotic theory that justifies an inference method is invalid, when it provides a poor approximation given the design of the empirical application, or both. It can be applied easily in a wide range of settings. If widely adopted by applied researchers, this assessment has the potential to substantially reduce the number of papers published on the basis of misleading inference. We analyze in detail the cases of differences-in-differences with few treated clusters, stratified experiments, shift-share designs, and matching estimators.
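
The abstract does not spell out the procedure, so the following is only a minimal sketch of one way such an assessment could look. It assumes the check amounts to holding the empirical design fixed (here, a treatment indicator), simulating outcomes for which the null hypothesis is true by construction, and recording how often a nominal 5% test rejects. The function name `simulated_rejection_rate`, the choice of i.i.d. standard normal outcomes, and the heteroskedasticity-robust (HC0) t-test are illustrative assumptions, not taken from the text.

```python
import numpy as np


def simulated_rejection_rate(treated, n_sims=2000, crit=1.96, seed=0):
    """Share of simulated null datasets in which a robust t-test rejects.

    `treated` is a 0/1 vector describing the (fixed) design. Outcomes are
    drawn as i.i.d. standard normals, so the true treatment effect is zero.
    This is an illustrative assumption, not the paper's stated procedure.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(treated, dtype=float)
    n = d.size
    X = np.column_stack([np.ones(n), d])      # intercept + treatment dummy
    XtX_inv = np.linalg.inv(X.T @ X)

    rejections = 0
    for _ in range(n_sims):
        y = rng.standard_normal(n)             # null is true by construction
        beta = XtX_inv @ (X.T @ y)             # OLS coefficients
        resid = y - X @ beta
        # HC0 sandwich variance: (X'X)^-1 X' diag(e^2) X (X'X)^-1
        meat = (X * resid[:, None] ** 2).T @ X
        V = XtX_inv @ meat @ XtX_inv
        t_stat = beta[1] / np.sqrt(V[1, 1])
        rejections += abs(t_stat) > crit       # nominal 5% two-sided test
    return rejections / n_sims


if __name__ == "__main__":
    few_treated = np.zeros(100)
    few_treated[:2] = 1                        # 2 treated, 98 controls
    balanced = np.zeros(100)
    balanced[:50] = 1                          # 50 treated, 50 controls
    print("few treated:", simulated_rejection_rate(few_treated))
    print("balanced:   ", simulated_rejection_rate(balanced))
```

Under these assumptions, a simulated rejection rate well above the nominal 5% for a given design would suggest that the asymptotic approximation behind the test is unreliable for that design, whereas a rate close to 5% would not flag a problem.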
