A Nonparametric Super-Efficient Estimator of the Average Treatment Effect

Doubly robust estimators are a popular means of estimating causal effects. Such estimators combine an estimate of the conditional mean of the outcome given treatment and confounders (the so-called outcome regression) with an estimate of the conditional probability of treatment given confounders (the propensity score) to generate an estimate of the effect of interest. Beyond the double-robustness property, these estimators offer further benefits. First, flexible regression tools, such as those developed in the field of machine learning, can be used to estimate the relevant regressions while the estimators of the treatment effects retain desirable statistical properties. Second, these estimators are often statistically efficient, achieving the lower bound on the variance of regular, asymptotically linear estimators. However, despite their asymptotic optimality, these estimators may behave erratically in problems where causal estimands are only weakly identifiable. We propose two new estimation techniques for use in these challenging settings. Our estimators build on two existing frameworks for efficient estimation: targeted minimum loss estimation and one-step estimation. Rather than using an estimate of the propensity score in their construction, we use an alternative regression quantity: the conditional probability of treatment given the conditional mean outcome. We discuss the theoretical implications and demonstrate the estimators' performance in simulated and real data.
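To make the combination of the two regressions concrete, the following is a minimal sketch of the standard one-step (augmented inverse probability weighted) estimate of the average treatment effect, computed from nuisance estimates that have already been fit. It is an illustration under stated assumptions, not the authors' implementation: the function name `aipw_ate` and its argument names are hypothetical, and the treatment probabilities in `g1` are taken as given, so the caller decides which regression quantity they come from.

```python
from statistics import mean


def aipw_ate(y, a, q1, q0, g1):
    """One-step (AIPW) estimate of the average treatment effect.

    All arguments are equal-length sequences (hypothetical names):
      y  : observed outcomes
      a  : binary treatment indicators (0 or 1)
      q1 : estimated conditional mean outcome under treatment
      q0 : estimated conditional mean outcome under control
      g1 : estimated probability of treatment, strictly between 0 and 1
    """
    contributions = []
    for yi, ai, q1i, q0i, gi in zip(y, a, q1, q0, g1):
        # Plug-in contrast from the outcome regression.
        plug_in = q1i - q0i
        # Outcome regression evaluated at the observed treatment.
        q_obs = q1i if ai == 1 else q0i
        # Inverse probability weight, signed by treatment arm.
        weight = ai / gi - (1 - ai) / (1 - gi)
        # Weighted residual corrects the plug-in contrast.
        contributions.append(plug_in + weight * (yi - q_obs))
    return mean(contributions)


# Toy data: an uninformative outcome regression and balanced treatment.
estimate = aipw_ate(
    y=[1.0, 0.0, 1.0, 0.0],
    a=[1, 0, 1, 0],
    q1=[0.5, 0.5, 0.5, 0.5],
    q0=[0.5, 0.5, 0.5, 0.5],
    g1=[0.5, 0.5, 0.5, 0.5],
)
```

The weights `1 / g1` and `1 / (1 - g1)` grow large when the estimated treatment probability is near 0 or 1, which is one way to see why estimators of this form can become unstable when the estimand is weakly identifiable.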
