Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising

In most real-world large-scale online applications (e.g., e-commerce or finance), customer acquisition is usually a multi-step audience conversion process. For example, audiences on e-commerce platforms typically follow an impression->click->purchase process. However, acquiring customers in financial advertising (e.g., credit card advertising) is more difficult than in traditional advertising. On the one hand, the audience multi-step conversion path is longer: for the credit card business in financial advertising, an impression->click->application->approval->activation process usually occurs during audience conversion. On the other hand, the positive feedback becomes sparser step by step (class imbalance), and the final positive feedback is difficult to obtain because the activation feedback is delayed. Therefore, it is necessary to use the positive feedback information of the former step to alleviate the class imbalance of the latter step. Multi-task learning is a typical solution in this direction. While considerable multi-task efforts have been made in this direction, a long-standing challenge is how to explicitly model the long-path sequential dependence among audience multi-step conversions so as to improve the end-to-end conversion. In this paper, we propose an Adaptive Information Transfer Multi-task (AITM) framework, which models the sequential dependence among audience multi-step conversions via the Adaptive Information Transfer (AIT) module. The AIT module can adaptively learn what and how much information to transfer for different conversion stages. Besides, by combining the Behavioral Expectation Calibrator in the loss function, the AITM framework can yield more accurate end-to-end conversion identification. The proposed framework is deployed in the Meituan app, where it is used to show, in real time, a banner for Meituan Co-Branded Credit Cards to audiences with a high end-to-end conversion rate. Offline experimental results on both industrial and public real-world datasets clearly demonstrate that the proposed framework achieves significantly better performance than state-of-the-art baselines. Besides, online experiments also demonstrate significant improvement over existing online models. Furthermore, we have released the source code of the proposed framework at https://github.com/xidongbo/AITM.
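To make the two components described above concrete, the following is a minimal PyTorch sketch inferred only from this abstract: an attention-style AIT module that decides how much information flows from conversion step t-1 into step t, and a calibrator loss that penalizes a later step being predicted as more likely than the step it depends on. The names `AITModule`, `calibrator_loss`, and `transfer`, and all architectural details, are illustrative assumptions, not the authors' actual implementation (see https://github.com/xidongbo/AITM for the released code).

```python
# Hedged sketch of the AIT module and Behavioral Expectation Calibrator,
# reconstructed from the abstract; not the official implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AITModule(nn.Module):
    """Adaptively fuses the (transformed) previous-step information with
    the current step's task-tower representation via attention over the
    two candidate inputs."""

    def __init__(self, dim: int):
        super().__init__()
        self.h1 = nn.Linear(dim, dim)        # query-like projection
        self.h2 = nn.Linear(dim, dim)        # key-like projection
        self.h3 = nn.Linear(dim, dim)        # value-like projection
        self.transfer = nn.Linear(dim, dim)  # assumed transform of step t-1 info

    def forward(self, prev_info: torch.Tensor, cur_repr: torch.Tensor) -> torch.Tensor:
        # Treat the transferred previous-step info and the current task-tower
        # output as a length-2 "sequence" and attend over it, so the model
        # learns what and how much information to transfer per sample.
        inputs = torch.stack([self.transfer(prev_info), cur_repr], dim=1)  # (B, 2, d)
        scores = (self.h1(inputs) * self.h2(inputs)).sum(-1) / inputs.size(-1) ** 0.5
        weights = F.softmax(scores, dim=1)                                 # (B, 2)
        return (weights.unsqueeze(-1) * self.h3(inputs)).sum(dim=1)       # (B, d)


def calibrator_loss(p_prev: torch.Tensor, p_cur: torch.Tensor) -> torch.Tensor:
    """Behavioral expectation: a later conversion step (e.g., activation)
    should not be predicted as more likely than the step before it
    (e.g., approval). Penalize violations with a hinge-style term."""
    return torch.relu(p_cur - p_prev).mean()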
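Under this reading, training would combine the per-step classification losses (e.g., binary cross-entropy for click, application, approval, and activation) with the calibrator term weighted by a hyperparameter, so the sequential dependence is enforced both architecturally (the AIT module) and in the objective; the exact loss weighting is not specified in the abstract.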
