Statistical matching and subclassification with a continuous dose: characterization, algorithms, and inference

Subclassification and matching are often used to adjust for observed covariates in observational studies; however, they are largely restricted to relatively simple study designs with a binary treatment. One important exception is Lu et al. (2001), who considered optimal pair matching with a continuous treatment dose. In this article, we propose two criteria for optimal subclassification/full matching with a continuous treatment dose, both based on subclass homogeneity, and an efficient polynomial-time algorithm that is guaranteed to find an optimal subclassification with respect to one criterion and serves as a 2-approximation algorithm for the other. We discuss how to incorporate the treatment dose and how to use appropriate penalties to control the number of subclasses in the design. Through extensive simulations, we systematically examine the performance of the proposed method and demonstrate that combining the proposed subclassification scheme with regression adjustment helps reduce model dependence in parametric causal inference with a continuous treatment dose. We illustrate the new design, and how to conduct randomization-based statistical inference under it, using Medicare and Medicaid claims data to study the effect of transesophageal echocardiography (TEE) during coronary artery bypass graft (CABG) surgery on patients' 30-day mortality rate.
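To make the idea of randomization-based inference with a continuous dose concrete, the following is a minimal Python sketch, not the procedure used in the article. It assumes that, within each subclass, every assignment of the observed doses to units is equally likely under Fisher's sharp null of no effect, and it uses the sum of dose-times-outcome products as the test statistic. The function names, the choice of statistic, and the Monte Carlo approximation are all illustrative assumptions.

```python
import random
from collections import defaultdict


def dose_outcome_stat(doses, outcomes):
    """Illustrative test statistic: sum of dose * outcome over all units."""
    return sum(d * y for d, y in zip(doses, outcomes))


def permutation_pvalue(doses, outcomes, subclass, n_perm=2000, seed=0):
    """One-sided Monte Carlo p-value under the sharp null of no effect.

    Assumption (hypothetical, for illustration): doses are exchangeable
    within each subclass, so they are shuffled only among units that
    share a subclass label.
    """
    rng = random.Random(seed)

    # Group unit indices by subclass label.
    groups = defaultdict(list)
    for i, s in enumerate(subclass):
        groups[s].append(i)

    observed = dose_outcome_stat(doses, outcomes)
    exceed = 0
    for _ in range(n_perm):
        permuted = list(doses)
        for idx in groups.values():
            shuffled = [doses[i] for i in idx]
            rng.shuffle(shuffled)
            for i, d in zip(idx, shuffled):
                permuted[i] = d
        # Small tolerance guards against floating-point ties.
        if dose_outcome_stat(permuted, outcomes) >= observed - 1e-12:
            exceed += 1

    # Adding 1 to numerator and denominator keeps the p-value valid.
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    doses = [1, 2, 3, 4, 2, 4, 6, 8, 1, 3, 5, 7]
    outcomes = [1, 2, 3, 4, 2, 4, 6, 8, 1, 3, 5, 7]
    subclass = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
    print(permutation_pvalue(doses, outcomes, subclass))
```

Shuffling doses only within subclasses is what ties the test to the design: the reference distribution reflects the comparison of units that the subclassification has made similar on observed covariates. Any other statistic that increases with the dose-outcome association could be substituted for `dose_outcome_stat`.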

[1]  Dylan S. Small,et al.  War and Wages : The Strength of Instrumental Variables and Their Sensitivity to Unobserved Biases , 2007 .

[2]  Dylan S. Small,et al.  Near/far matching: a study design approach to instrumental variables , 2012, Health Services and Outcomes Research Methodology.

[3]  Paul R. Rosenbaum,et al.  Matching Methods for Observational Studies Derived from Large Administrative Databases , 2020 .

[4]  G. King,et al.  Multivariate Matching Methods That Are Monotonic Imbalance Bounding , 2011 .

[5]  Elaine L. Zanutto,et al.  Matching With Doses in an Observational Study of a Media Campaign Against Drug Abuse , 2001, Journal of the American Statistical Association.

[7]  M. Pagano,et al.  On Obtaining Permutation Distributions in Polynomial Time , 1983 .

[8]  Paul R. Rosenbaum,et al.  Balanced Risk Set Matching , 2001 .

[9]  Gary King,et al.  Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference , 2007, Political Analysis.

[10]  Dylan S. Small,et al.  War and Wages , 2008 .

[11]  Vijay V. Vazirani,et al.  Approximation Algorithms , 2001, Springer Berlin Heidelberg.

[12]  Paul R. Rosenbaum,et al.  Modern Algorithms for Matching in Observational Studies , 2020, Annual Review of Statistics and Its Application.

[13]  Paul R. Rosenbaum,et al.  Design sensitivity in observational studies , 2004 .

[14]  Dylan S Small,et al.  Optimal Matching with Minimal Deviation from Fine Balance in a Study of Obesity and Surgical Outcomes , 2012, Biometrics.

[15]  Peng Ding,et al.  Randomization inference for treatment effect variation , 2014, 1412.5000.

[16]  T. Shakespeare,et al.  Observational Studies , 2003 .

[17]  P. Rosenbaum A Characterization of Optimal Designs for Observational Studies , 1991 .

[18]  R. D. Alley,et al.  The society of thoracic surgeons. , 1976, The Annals of thoracic surgery.

[19]  S. Lemeshow,et al.  Triplet Matching for Estimating Causal Effects With Three Treatment Arms: A Comparative Study of Mortality by Trauma Center Level , 2020 .

[20]  D. Rubin The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials , 2007, Statistics in medicine.

[21]  Elizabeth A Stuart,et al.  Matching methods for causal inference: A review and a look forward. , 2010, Statistical science : a review journal of the Institute of Mathematical Statistics.

[22]  Dylan S. Small,et al.  Increasing Power for Observational Studies of Aberrant Response: An Adaptive Approach , 2019 .

[23]  B. Hansen,et al.  Optimal Full Matching and Related Designs via Network Flows , 2006 .

[24]  Jasjeet S. Sekhon,et al.  Generalized Full Matching , 2017, Political Analysis.

[25]  Samuel D. Pimentel,et al.  Large, Sparse Optimal Matching With Refined Covariate Balance in an Observational Study of the Health Outcomes Produced by New Surgeons , 2015, Journal of the American Statistical Association.

[26]  Emily J. Mackay,et al.  Protocol for a Retrospective, Comparative Effectiveness Study of the Association Between Transesophageal Echocardiography (TEE) Monitoring Used in Coronary Artery Bypass Graft (CABG) Surgery and Clinical Outcomes , 2020, medRxiv.

[27]  Janet S. Wright,et al.  2011 ACCF/AHA Guideline for Coronary Artery Bypass Graft Surgery. A report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines. Developed in collaboration with the American Association for Thoracic Surgery, Society of Cardiovascular Anesthesi , 2011, Journal of the American College of Cardiology.

[28]  D. Rubin,et al.  Using Multivariate Matched Sampling and Regression Adjustment to Control Bias in Observational Studies , 1978 .

[29]  J. Zubizarreta  Using Mixed Integer Programming for Matching in an Observational Study of Kidney Failure after Surgery , 2022, Journal of the American Statistical Association.

[30]  D. Rubin Matched Sampling for Causal Effects: Matching to Remove Bias in Observational Studies , 1973 .

[31]  Alexander Schrijver,et al.  Combinatorial optimization. Polyhedra and efficiency. , 2003 .

[32]  Paul R. Rosenbaum,et al.  Robust, accurate confidence intervals with a weak instrument: quarter of birth and education , 2005 .

[33]  J. Sekhon,et al.  Genetic Matching for Estimating Causal Effects: A General Multivariate Matching Method for Achieving Balance in Observational Studies , 2006, Review of Economics and Statistics.

[34]  B. Hansen Full Matching in an Observational Study of Coaching for the SAT , 2004 .

[35]  Jasjeet S. Sekhon,et al.  Multivariate and Propensity Score Matching Software with Automated Balance Optimization: The Matching Package for R , 2008 .

[36]  Dylan S. Small,et al.  Using Approximation Algorithms to Build Evidence Factors and Related Designs for Observational Studies , 2019, Journal of Computational and Graphical Statistics.

[37]  S. Kruger Design Of Observational Studies , 2016 .

[38]  Ulrich Derigs,et al.  Solving non-bipartite matching problems via shortest path techniques , 1988 .

[39]  Paul R. Rosenbaum,et al.  Optimal Matching for Observational Studies , 1989 .

[40]  Gary King,et al.  MatchIt: Nonparametric Preprocessing for Parametric Causal Inference , 2011 .

[41]  P. Rosenbaum,et al.  Minimum Distance Matched Sampling With Fine Balance in an Observational Study of Treatment for Ovarian Cancer , 2007 .

[42]  David P. Williamson,et al.  The Design of Approximation Algorithms , 2011 .

[43]  M. Dwass Modified Randomization Tests for Nonparametric Hypotheses , 1957 .

[44]  Sanjay Basu,et al.  Near-Far Matching in R: The nearfar Package. , 2018, Journal of statistical software.

[45]  Dylan S. Small,et al.  Re-Evaluating Strengthened-IV Designs: Asymptotic Efficiency, Bias Formula, and the Validity and Power of Sensitivity Analyses , 2019, 1911.09171.

[46]  Xinyi Xu,et al.  Optimal Nonbipartite Matching and Its Statistical Applications , 2011, The American statistician.

[47]  Dylan S. Small,et al.  Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants , 2010 .