Using Approximation Algorithms to Build Evidence Factors and Related Designs for Observational Studies

Abstract: Observational or nonrandomized studies of treatment effects are often constructed with the aid of polynomial-time algorithms that optimally form matched treatment-control pairs or matched sets. Because each observational comparison may be affected by bias, investigators often reinforce a single comparison with an additional comparison that is unlikely to be affected by the same biases, for instance by using multiple control groups, evidence factors, or control + instrument designs. Two comparisons affected by different biases may reveal bias if they disagree, or may show that comparisons with different weaknesses concur in their conclusions. Even the simplest such addition, a second comparison, creates design problems without polynomial-time solutions. When no polynomial-time algorithm can solve a problem, a so-called approximation algorithm offers a compromise: it provides a solution in polynomial time that is provably not much worse than the unattainable optimal solution. Building upon existing techniques for related problems in operations research, we develop an approximation algorithm for minimum distance matching with near-fine balance for three comparison groups. This algorithm is a practical approach to most observational designs that add a second comparison. The method is applied to an observational study of the effects of side airbags on injury severity, using the U.S. Fatality Analysis Reporting System. For many car makes and models, side airbags were initially unavailable, then available as optional equipment for an additional fee, and still later provided as standard equipment. Within sets matched for make and model of car, safety belt use, direction of impact, and other covariates, we compare crashes in these three periods, where each comparison has different limitations. The method is implemented in the R package approxmatch, whose example reproduces some of the calculations.
Supplementary materials for this article are available online.
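
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the algorithm developed in the article and it does not use the approxmatch package: it ignores near-fine balance, uses a single made-up covariate, and all function names and data are hypothetical. Three groups of equal size are split into triples by two sequential pair matchings, a common heuristic for three-dimensional assignment, and the result is compared with the exact optimum found by exhaustive search. The brute-force helper stands in for a polynomial-time bipartite matching routine and is only practical for tiny inputs.

```python
from itertools import permutations


def dist(x, y):
    # Toy covariate distance on one made-up covariate (assumption).
    return abs(x - y)


def triple_cost(a, b, c):
    # Cost of a matched triple: sum of its three pairwise distances.
    return dist(a, b) + dist(a, c) + dist(b, c)


def total_cost(triples):
    return sum(triple_cost(a, b, c) for a, b, c in triples)


def best_assignment(rows, cols, cost):
    # Exhaustive minimum-cost one-to-one assignment of rows to cols.
    # Stand-in for a polynomial-time bipartite solver; tiny inputs only.
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(len(cols))):
        total = sum(cost(rows[i], cols[j]) for i, j in enumerate(perm))
        if total < best_total:
            best_perm, best_total = perm, total
    return best_perm


def sequential_triples(A, B, C):
    # Step 1: pair each unit in A with a unit in B.
    perm_ab = best_assignment(A, B, dist)
    pairs = [(A[i], B[j]) for i, j in enumerate(perm_ab)]
    # Step 2: attach one unit of C to each pair, charging its distance
    # to both members of the pair.
    perm_c = best_assignment(
        pairs, C, lambda p, c: dist(p[0], c) + dist(p[1], c)
    )
    return [(p[0], p[1], C[k]) for p, k in zip(pairs, perm_c)]


def optimal_cost(A, B, C):
    # Exact optimum over all ways to form triples (exponential time).
    n = len(A)
    return min(
        sum(triple_cost(A[i], B[pb[i]], C[pc[i]]) for i in range(n))
        for pb in permutations(range(n))
        for pc in permutations(range(n))
    )


# Made-up covariate values for three comparison groups.
A = [1.0, 4.0, 9.0]
B = [2.0, 5.0, 8.0]
C = [0.0, 6.0, 7.0]

triples = sequential_triples(A, B, C)
print("heuristic triples:", triples)
print("heuristic cost:", total_cost(triples))
print("optimal cost:", optimal_cost(A, B, C))
```

The heuristic can never beat the exhaustive optimum. An approximation guarantee is the further, provable statement that its cost stays within a fixed factor of that optimum, which this sketch does not establish.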
