Balancing Automated Procedures for Confounding Control with Background Knowledge

In this issue of Epidemiology, Patorno et al.1 illustrate the importance of using subject matter knowledge to complement the automated high-dimensional propensity score (hdPS) algorithm when controlling for confounding in claims-based studies with few exposed outcomes. Variable selection for propensity score (PS) models in settings with large numbers of potential confounders has received considerable attention in recent years, in part because it remains uncertain what role automated procedures should play in the selection process. As large healthcare databases are increasingly used in epidemiology,2–4 automated procedures can help identify potential confounders that are unknown to the investigator.5–7 Further, the application of automated procedures is likely to expand as safety surveillance receives more attention as part of the Food and Drug Administration’s Sentinel Initiative.8 In these settings, automated procedures such as the hdPS can increase the speed and efficiency of active surveillance.7 Given the growing need for automated methods of confounding control in these areas of epidemiologic research, the question becomes: how should investigators balance automated procedures with subject matter knowledge?

[1] S. Merhar, et al. Letter to the editor, 2005, IEEE Communications Magazine.

[2] M Alan Brookhart, et al. Covariate selection in high-dimensional propensity score analyses of treatment effects in small samples, 2011, American journal of epidemiology.

[3] J. Pearl, et al. Causal diagrams for epidemiologic research, 1999, Epidemiology.

[4] Sebastian Schneeweiss, et al. Using high‐dimensional propensity scores to automate confounding control in a distributed medical product safety surveillance system, 2012, Pharmacoepidemiology and drug safety.

[5] M Alan Brookhart, et al. The implications of propensity score variable selection strategies in pharmacoepidemiology: an empirical illustration, 2011, Pharmacoepidemiology and drug safety.

[6] J. Robins. Data, Design, and Background Knowledge in Etiologic Inference, 2001, Epidemiology.

[7] G. Shaw, et al. Maternal pesticide exposure from multiple sources and selected congenital anomalies, 1999.

[8] R. Platt, et al. The new Sentinel Network--improving the evidence of medical-product safety, 2009, The New England journal of medicine.

[9] C. O'Connell, et al. The TEMPI syndrome--a novel multisystem disease, 2011, The New England journal of medicine.

[10] Jun Liu, et al. Studies with Many Covariates and Few Outcomes: Selecting Covariates and Implementing Propensity-Score–Based Confounding Adjustments, 2014, Epidemiology.

[11] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[12] J. Myers, et al. Effects of adjusting for instrumental variables on bias and precision of effect estimates, 2011, American journal of epidemiology.

[13] W. Ray, et al. Population-based studies of adverse drug effects, 2003, The New England journal of medicine.

[14] J. Rassen, et al. Simultaneously assessing intended and unintended treatment effects of multiple treatment options: a pragmatic “matrix design”, 2011, Pharmacoepidemiology and drug safety.

[15] J. Pearl. Invited commentary: understanding bias amplification, 2011, American journal of epidemiology.

[16] S. Schneeweiss, et al. Implications of M Bias in Epidemiologic Studies: a Simulation Study, 2022.

[17] J. Avorn, et al. High-dimensional Propensity Score Adjustment in Studies of Treatment Effects Using Health Care Claims Data, 2009, Epidemiology.

[18] M. Hernán, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology, 2002, American journal of epidemiology.

[19] W. Ray, et al. Use of disease risk scores in pharmacoepidemiologic studies, 2009, Statistical methods in medical research.

[20] J. Avorn, et al. A review of uses of health care utilization databases for epidemiologic research on therapeutics, 2005, Journal of clinical epidemiology.

[21] J. Rassen, et al. Re: Confounding adjustment via a semi-automated high-dimensional propensity score algorithm: an application to electronic medical records, 2011, Pharmacoepidemiology and drug safety.

[22] Sengwee Toh, et al. Confounding adjustment via a semi‐automated high‐dimensional propensity score algorithm: an application to electronic medical records, 2011, Pharmacoepidemiology and drug safety.

[23] Judea Pearl, et al. Comment on ‘Causal inference, probability theory, and graphical insights’ by Stuart G. Baker, 2013, Statistics in medicine.