Matching via Dimensionality Reduction for Estimation of Treatment Effects in Digital Marketing Campaigns

Nearest-neighbor matching is a widely used method for estimating counterfactuals and causal treatment effects from observational data. It typically pairs each treated unit with the control unit nearest to it in covariate space and then estimates an average treatment effect from the set of matched pairs. Although straightforward to implement, this estimator is known to suffer from a bias that increases with the dimensionality of the covariate space, which is undesirable in applications that involve high-dimensional data. To address this problem, we propose a novel estimator that first projects the data onto a number of random linear subspaces and then estimates the median treatment effect by nearest-neighbor matching in each subspace. We empirically compute the mean squared error of the proposed estimator using semi-synthetic data, and we demonstrate the method on real-world digital marketing campaign data. The results show marked improvement over baseline methods.
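The estimator described above can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: it assumes Gaussian random projection matrices, one-to-one nearest-neighbor matching with replacement under Euclidean distance, an effect defined on the treated units, and aggregation by taking the median of the per-subspace estimates. The function names and the parameters `k`, `n_proj`, and `seed` are hypothetical.

```python
import numpy as np


def nn_match_att(X, y, t):
    """1-nearest-neighbor matching estimate of the average effect on the treated.

    X: (n, d) covariates, y: (n,) outcomes, t: (n,) binary treatment indicator.
    Each treated unit is matched (with replacement) to its closest control unit.
    """
    Xt, Xc = X[t == 1], X[t == 0]
    yt, yc = y[t == 1], y[t == 0]
    # Squared Euclidean distance between every treated and every control unit.
    d2 = ((Xt[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    # Average outcome difference over the matched pairs.
    return float(np.mean(yt - yc[nearest]))


def random_projection_att(X, y, t, k=2, n_proj=25, seed=0):
    """Median of matching estimates computed in n_proj random k-dim subspaces.

    Assumption: projections are i.i.d. Gaussian matrices and the per-subspace
    estimates are combined with a median; the source does not specify either.
    """
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_proj):
        # Random linear map from d covariates down to k dimensions.
        P = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
        estimates.append(nn_match_att(X @ P, y, t))
    return float(np.median(estimates)), estimates
```

Calling `random_projection_att(X, y, t)` returns the aggregated estimate together with the list of per-subspace estimates, which makes it easy to inspect how much the estimates vary across projections. Matching in a low-dimensional subspace is the step intended to counter the dimension-dependent bias of matching on the full covariate vector.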
