Variance Reduction for Experiments with One-Sided Triggering using CUPED

In online experimentation, trigger-dilute analysis is an approach to obtain more precise estimates of intent-to-treat (ITT) effects when the intervention is only exposed, or "triggered", for a small subset of the population. Trigger-dilute analysis cannot be used for estimation when triggering is only partially observed. In this paper, we propose an unbiased ITT estimator with reduced variance for cases where triggering status is only observed in the treatment group. Our method is based on the efficiency augmentation idea of CUPED and draws upon identification frameworks from the principal stratification and instrumental variables literature. The unbiasedness of our estimation approach relies on a testable assumption that an augmentation term used for covariate adjustment equals zero in expectation. When this augmentation term fails a mean-zero test, we show how our estimator can incorporate in-experiment observations to reduce the augmentation’s bias, at the cost of some of the variance reduction. This provides an explicit knob to trade off bias against variance. We demonstrate through simulations that our estimator can remain unbiased and achieve precision improvements as good as if triggering status were fully observed, and in some cases outperforms trigger-dilute analysis.
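The efficiency augmentation idea that the abstract builds on can be illustrated with a minimal sketch of classic CUPED, in which the ITT estimate is adjusted by a term whose expectation is zero under randomization. This is not the estimator proposed in the paper: it uses a single pre-experiment covariate and does not handle one-sided triggering. The function names, the data-generating process, and all parameter values (trigger rate, effect size, noise levels) are assumptions made purely for illustration.

```python
import random
import statistics


def mean_diff(treated, control):
    """Difference in sample means between the two arms."""
    return statistics.fmean(treated) - statistics.fmean(control)


def cuped_itt(y_t, y_c, x_t, x_c):
    """Return (unadjusted, CUPED-adjusted) ITT estimates.

    x is a pre-experiment covariate, so mean_diff(x_t, x_c) has
    expectation zero under randomization and can serve as the
    augmentation term.
    """
    x_all, y_all = x_t + x_c, y_t + y_c
    x_bar, y_bar = statistics.fmean(x_all), statistics.fmean(y_all)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_all, y_all))
    var_x = sum((x - x_bar) ** 2 for x in x_all)
    theta = cov_xy / var_x  # pooled regression slope of y on x
    raw = mean_diff(y_y := y_t, y_c) if False else mean_diff(y_t, y_c)
    augmentation = mean_diff(x_t, x_c)
    return raw, raw - theta * augmentation


def simulate_arm(rng, n, treated, p_trigger=0.1, effect=1.0):
    """Hypothetical data: only triggered, treated units receive the effect."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        triggered = rng.random() < p_trigger
        y = 2.0 * x + rng.gauss(0.0, 1.0)
        if treated and triggered:
            y += effect
        xs.append(x)
        ys.append(y)
    return xs, ys


def replicate(n_reps=200, n_per_arm=200, seed=0):
    """Repeat the simulated experiment and collect both estimates."""
    rng = random.Random(seed)
    raw_estimates, adjusted_estimates = [], []
    for _ in range(n_reps):
        x_t, y_t = simulate_arm(rng, n_per_arm, treated=True)
        x_c, y_c = simulate_arm(rng, n_per_arm, treated=False)
        raw, adjusted = cuped_itt(y_t, y_c, x_t, x_c)
        raw_estimates.append(raw)
        adjusted_estimates.append(adjusted)
    return raw_estimates, adjusted_estimates


if __name__ == "__main__":
    raw, adj = replicate()
    print("mean raw:", statistics.fmean(raw))
    print("mean adjusted:", statistics.fmean(adj))
    print("variance raw:", statistics.variance(raw))
    print("variance adjusted:", statistics.variance(adj))
```

In this simulated setup the true ITT effect is the trigger rate times the effect among triggered units (0.1 × 1.0 = 0.1). Both estimators should centre near that value, with the adjusted one showing a noticeably smaller variance across replications because the covariate explains much of the outcome noise.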
