UKD: Debiasing Conversion Rate Estimation via Uncertainty-regularized Knowledge Distillation

In online advertising, conventional post-click conversion rate (CVR) estimation models are trained only on clicked samples. During online serving, however, these models must estimate CVR for all impression ads, which leads to the sample selection bias (SSB) issue. Intuitively, providing reliable supervision signals for unclicked ads is a feasible way to alleviate SSB. This paper proposes an uncertainty-regularized knowledge distillation (UKD) framework that debiases CVR estimation by distilling knowledge from unclicked ads. A teacher model learns click-adaptive representations and produces pseudo-conversion labels on unclicked ads as supervision signals. A student model is then trained on both clicked and unclicked ads with knowledge distillation, and performs uncertainty modeling to alleviate the inherent noise in the pseudo-labels. Experiments on billion-scale datasets show that UKD outperforms previous debiasing methods, and online results verify that UKD achieves significant improvements.
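The student objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss, so everything specific here is an assumption made for illustration: the function names `bce` and `ukd_style_loss`, the weight `alpha`, and the use of a per-sample log-variance `s` that down-weights the distillation term by `exp(-s)` and adds `s` as a regularizer. This is one common form of uncertainty-weighted pseudo-label learning, not necessarily the formulation used in UKD.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy between a prediction p and a (possibly soft) label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def ukd_style_loss(clicked, unclicked, alpha=1.0):
    """Hypothetical student loss over clicked and unclicked ads.

    clicked:   list of (student_pred, conversion_label) pairs
    unclicked: list of (student_pred, teacher_pseudo_label, log_variance) triples
    alpha:     assumed weight on the distillation term
    """
    # Supervised term: true conversion labels exist only for clicked ads.
    supervised = sum(bce(p, y) for p, y in clicked)

    # Distillation term: teacher pseudo-labels on unclicked ads.
    # Assumption: exp(-s) shrinks the loss of uncertain samples, and
    # the "+ s" term keeps s from growing without bound.
    distill = 0.0
    for p, q, s in unclicked:
        distill += math.exp(-s) * bce(p, q) + s

    n = len(clicked) + len(unclicked)
    return (supervised + alpha * distill) / n


# A pseudo-label the student disagrees with costs less when it is
# flagged as uncertain (s = 1.0) than when it is trusted (s = 0.0).
trusted = ukd_style_loss([], [(0.9, 0.1, 0.0)])
uncertain = ukd_style_loss([], [(0.9, 0.1, 1.0)])
print(trusted > uncertain)
```

In this sketch the uncertainty `s` is passed in as a number. In a trained model it would be produced by the network itself, which is why the regularizer is needed to prevent the trivial solution of marking every pseudo-label as uncertain.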
