Optimal Full Matching and Related Designs via Network Flows

In the matched analysis of an observational study, confounding on covariates X is addressed by comparing members of a distinguished group (Z = 1) to controls (Z = 0) only when they belong to the same matched set. The better matchings, therefore, are those whose matched sets exhibit both dispersion in Z and uniformity in X. For dispersion in Z, pair matching is best, creating matched sets that are equally balanced between the groups; but actual data place limits, often severe limits, on matched pairs' uniformity in X. At the other extreme is full matching, the matched sets of which are as uniform in X as can be, while often so poorly dispersed in Z as to sacrifice efficiency. This article presents an algorithm for exploring the intermediate territory. Given requirements on matched sets' uniformity in X and dispersion in Z, the algorithm first decides the requirements' feasibility. In feasible cases, it furnishes a match that is optimal for X-uniformity among matches with Z-dispersion as stipulated. To illustrate, we describe the algorithm's use in a study comparing women's to men's working conditions; and we compare our method to a commonly used alternative, greedy matching, which is neither optimal nor as flexible but is algorithmically much simpler. The comparison finds meaningful advantages, in terms of both bias and efficiency, for our more studied approach.
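
The contrast between greedy and optimal matching can be illustrated with a toy pair-matching problem. The sketch below is not the network-flow algorithm of the article: it assumes a single covariate, uses made-up values, and finds the optimal match by brute-force enumeration, which is practical only for very small inputs. The function names and data are hypothetical, chosen for illustration.

```python
from itertools import permutations


def greedy_pair_match(dist):
    """Match each treated unit, in order, to its nearest unused control."""
    used = set()
    pairs = []
    for t, row in enumerate(dist):
        c = min((j for j in range(len(row)) if j not in used),
                key=lambda j: row[j])
        used.add(c)
        pairs.append((t, c))
    return pairs


def optimal_pair_match(dist):
    """Minimize total distance by enumerating every assignment.

    Brute force stands in for a min-cost-flow solver here; it is an
    assumption of this sketch, not the article's method.
    """
    n_treated, n_controls = len(dist), len(dist[0])
    best = min(permutations(range(n_controls), n_treated),
               key=lambda p: sum(dist[t][c] for t, c in enumerate(p)))
    return list(enumerate(best))


def total_distance(dist, pairs):
    """Sum of covariate distances over the matched pairs."""
    return sum(dist[t][c] for t, c in pairs)


# Hypothetical covariate values X for treated (Z = 1) and control (Z = 0) units.
treated = [50, 40]
controls = [44, 80]
dist = [[abs(x - y) for y in controls] for x in treated]

greedy = greedy_pair_match(dist)
optimal = optimal_pair_match(dist)
print("greedy total: ", total_distance(dist, greedy))
print("optimal total:", total_distance(dist, optimal))
```

In this example the greedy rule gives the first treated unit its nearest control, leaving the second treated unit with a distant one; the exhaustive search finds an assignment with a smaller total distance.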
