Stronger instruments via integer programming in an observational study of late preterm birth outcomes

In an optimal nonbipartite match, a single population is divided into matched pairs to minimize a total distance within matched pairs. Nonbipartite matching has been used to strengthen instrumental variables in observational studies of treatment effects, essentially by forming pairs that are similar in terms of covariates but very different in the strength of encouragement to accept the treatment. Optimal nonbipartite matching is typically done using network optimization techniques that can be quick, running in polynomial time, but these techniques limit the tools available for matching. Instead, we use integer programming techniques, thereby obtaining a wealth of new tools not previously available for nonbipartite matching, including fine and near-fine balance for several nominal variables, forced near balance on means, and optimal subsetting. We illustrate the methods in our ongoing study of outcomes of late-preterm births in California, that is, births at 34 to 36 weeks of gestation. Would lengthening the time in the hospital for such births reduce the frequency of rapid readmissions? A straightforward comparison of babies who stay for a shorter or longer time would be severely biased, because the principal reason for a long stay is some serious health problem. We need an instrument, something inconsequential and haphazard that encourages a shorter or a longer stay in the hospital. It turns out that babies born at certain times of day tend to stay overnight once, with a shorter length of stay, whereas babies born at other times of day tend to stay overnight twice, with a longer length of stay, and there is nothing particularly special about a baby who is born at 11:00 pm.
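
The idea of pairing units that are close on covariates but far apart in encouragement can be illustrated on a toy population. The sketch below is not the integer program used in the study and imposes none of the balance constraints mentioned above; it simply finds the minimum-total-distance pairing by exhaustive dynamic programming, which is feasible only for a handful of units. The covariate `x`, the encouragement score `z`, and the `min_gap` and `penalty` parameters are all hypothetical names and values chosen for illustration.

```python
from functools import lru_cache


def pair_distance(u, v, min_gap=1.0, penalty=100.0):
    """Covariate distance, plus a penalty when encouragement differs too little."""
    covariate_gap = abs(u["x"] - v["x"])
    instrument_gap = abs(u["z"] - v["z"])
    return covariate_gap + (penalty if instrument_gap < min_gap else 0.0)


def optimal_nonbipartite_match(units, min_gap=1.0, penalty=100.0):
    """Return (total_distance, pairs) for the best pairing of one population.

    Exact but exponential in the number of units, so a toy method only.
    """
    n = len(units)
    if n % 2:
        raise ValueError("need an even number of units to pair everyone")
    dist = [[pair_distance(units[i], units[j], min_gap, penalty)
             for j in range(n)] for i in range(n)]

    @lru_cache(maxsize=None)
    def best(mask):
        # mask has bit k set when unit k is still unmatched
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest-numbered unmatched unit
        rest = mask & ~(1 << i)
        best_cost, best_pairs = float("inf"), ()
        for j in range(i + 1, n):
            if rest & (1 << j):
                cost, pairs = best(rest & ~(1 << j))
                cost += dist[i][j]
                if cost < best_cost:
                    best_cost, best_pairs = cost, ((i, j),) + pairs
        return best_cost, best_pairs

    return best((1 << n) - 1)


# Hypothetical units: x is a covariate, z is the strength of encouragement.
units = [
    {"x": 30, "z": 0},
    {"x": 31, "z": 0},
    {"x": 30, "z": 2},
    {"x": 32, "z": 2},
]
cost, pairs = optimal_nonbipartite_match(units)
print(cost, pairs)
```

In this toy population the chosen pairs each join a unit with `z = 0` to a unit with `z = 2`, because pairing two units with the same encouragement incurs the penalty. A realistic application would replace the exhaustive search with a network or integer programming solver.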
