Adversarial Counterfactual Learning and Evaluation for Recommender System

The feedback data of recommender systems are often subject to what was exposed to the users; however, most learning and evaluation methods do not account for the underlying exposure mechanism. We first show in theory that applying supervised learning to detect user preferences may end up with inconsistent results in the absence of exposure information. The counterfactual propensity-weighting approach from causal inference can account for the exposure mechanism; nevertheless, the partial-observation nature of the feedback data can cause identifiability issues. We propose a principled solution by introducing a minimax empirical risk formulation. We show that the relaxation of the dual problem can be converted to an adversarial game between two recommendation models, where the opponent of the candidate model characterizes the underlying exposure mechanism. We provide learning bounds and conduct extensive simulation studies to illustrate and justify the proposed approach over a broad range of recommendation settings, which shed insights on the various benefits of the proposed approach.

[1]  Shalabh Bhatnagar,et al.  Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games , 2015, AAMAS.

[2]  Deborah Estrin,et al.  Unbiased offline recommender evaluation for missing-not-at-random implicit feedback , 2018, RecSys.

[3]  D. Rubin,et al.  Nonstandard conditionally specified models for nonignorable missing data , 2016, Proceedings of the National Academy of Sciences.

[4]  Daniel Kuhn,et al.  Regularization via Mass Transportation , 2017, J. Mach. Learn. Res..

[5]  E. Stuart,et al.  Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies , 2015, Statistics in medicine.

[6]  Ian J. Goodfellow,et al.  NIPS 2016 Tutorial: Generative Adversarial Networks , 2016, ArXiv.

[7]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[8]  W. Bruce Croft,et al.  Unbiased Learning to Rank with Unbiased Propensity Estimation , 2018, SIGIR.

[9]  H Tateishi,et al.  On Some Minimax Theorems , 1989 .

[10]  Thorsten Joachims,et al.  Recommendations as Treatments: Debiasing Learning and Evaluation , 2016, ICML.

[11]  M. Talagrand Upper and Lower Bounds for Stochastic Processes: Modern Methods and Classical Problems , 2014 .

[12]  Heng-Tze Cheng,et al.  Wide & Deep Learning for Recommender Systems , 2016, DLRS@RecSys.

[13]  Da Xu,et al.  Self-attention with Functional Time Representation Learning , 2019, NeurIPS.

[14]  Xi Chen,et al.  Wasserstein Distributional Robustness and Regularization in Statistical Learning , 2017, ArXiv.

[15]  Alexander D'Amour,et al.  Flexible Sensitivity Analysis for Observational Studies Without Observable Implications , 2018, Journal of the American Statistical Association.

[16]  David M. Blei,et al.  Causal Inference for Recommendation , 2016 .

[17]  S. Kruger Design Of Observational Studies , 2016 .

[18]  Alan L. Yuille,et al.  Feature Denoising for Improving Adversarial Robustness , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Dietmar Jannach,et al.  Are we really making much progress? A worrying analysis of recent neural recommendation approaches , 2019, RecSys.

[20]  J. Pearl Causal inference in statistics: An overview , 2009 .

[21]  D. Rubin,et al.  The central role of the propensity score in observational studies for causal effects , 1983 .

[22]  G. Imbens,et al.  Estimation of Causal Effects using Propensity Score Weighting: An Application to Data on Right Heart Catheterization , 2001, Health Services and Outcomes Research Methodology.

[23]  Dan Frankowski,et al.  Collaborative Filtering Recommender Systems , 2007, The Adaptive Web.

[24]  Yi Tay,et al.  Deep Learning based Recommender System: A Survey and New Perspectives , 2018 .

[25]  Julian J. McAuley,et al.  Self-Attentive Sequential Recommendation , 2018, 2018 IEEE International Conference on Data Mining (ICDM).

[26]  Nathan Kallus,et al.  DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training , 2018, ICML.

[27]  Mihaela van der Schaar,et al.  GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets , 2018, ICLR.

[28]  Kun Zhang,et al.  Modeling Dynamic Missingness of Implicit Feedback for Recommendation , 2018, NeurIPS.

[29]  Guorui Zhou,et al.  Deep Interest Network for Click-Through Rate Prediction , 2017, KDD.

[30]  Zoubin Ghahramani,et al.  Probabilistic Matrix Factorization with Non-random Missing Data , 2014, ICML.

[31]  Tat-Seng Chua,et al.  Neural Collaborative Filtering , 2017, WWW.

[32]  Yuta Saito,et al.  Unbiased Recommender Learning from Missing-Not-At-Random Implicit Feedback , 2020, WSDM.

[33]  Thorsten Joachims,et al.  Counterfactual Learning-to-Rank for Additive Metrics and Deep Models , 2018, ArXiv.

[34]  David M. Blei,et al.  Modeling User Exposure in Recommendation , 2015, WWW.

[35]  S. Shankar Sastry,et al.  Characterization and computation of local Nash equilibria in continuous games , 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[36]  Paul Covington,et al.  Deep Neural Networks for YouTube Recommendations , 2016, RecSys.

[37]  Wei Chu,et al.  A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.

[38]  Dan Boneh,et al.  Ensemble Adversarial Training: Attacks and Defenses , 2017, ICLR.

[39]  Peng Zhang,et al.  IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models , 2017, SIGIR.

[40]  Xiaoyu Du,et al.  Adversarial Personalized Ranking for Recommendation , 2018, SIGIR.

[41]  Sanjay Mehrotra,et al.  Distributionally Robust Optimization: A Review , 2019, ArXiv.

[42]  C. Villani Optimal Transport: Old and New , 2008 .

[43]  Nicholas Jing Yuan,et al.  DRN: A Deep Reinforcement Learning Framework for News Recommendation , 2018, WWW.

[44]  Michael I. Jordan,et al.  What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization? , 2019, ICML.

[45]  Walid Krichene,et al.  Neural Collaborative Filtering vs. Matrix Factorization Revisited , 2020, RecSys.

[46]  Yoav Zemel,et al.  Statistical Aspects of Wasserstein Distances , 2018, Annual Review of Statistics and Its Application.

[47]  Thorsten Joachims,et al.  Unbiased Learning-to-Rank with Biased Feedback , 2016, WSDM.

[48]  P. Austin An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies , 2011, Multivariate behavioral research.

[49]  Jure Leskovec,et al.  Graph Convolutional Neural Networks for Web-Scale Recommender Systems , 2018, KDD.