AutoConjunction: Adaptive Model-based Feature Conjunction for CTR Prediction

Click-through rate (CTR) prediction is an important topic in mobile recommendation systems and computational advertising. As previous research indicates, a key to accurate CTR prediction is feature conjunction, which makes the training data more informative. Despite great progress, existing methods still fail to choose suitable settings of feature conjunction for the given data. In particular, a linear model on pair-wise feature conjunctions may overfit the training set if the data set is highly sparse. For such data, a model based on low-rank latent matrices is shown to be more appropriate. Unfortunately, practitioners face difficulty in deciding which model to use for which data. In this paper, we propose an adaptive framework to address the feature-conjunction problem. The proposed framework adaptively chooses an effective model for feature conjunction according to properties of the data. We present a concrete case in which the choice is made based on feature-pair frequency. Efficient training and parameter selection are thoroughly investigated. Comprehensive online and offline experiments demonstrate the effectiveness of the adaptive model over existing models for CTR prediction.
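To make the frequency-based choice concrete, the following is a minimal Python sketch of the idea described above: frequent feature pairs get their own explicit conjunction weight (as in a degree-2 linear model), while rare pairs fall back to a low-rank latent inner product (as in a factorization machine). The class name AdaptiveConjunctionScorer, the threshold PAIR_FREQ_THRESHOLD, the toy data, and the absence of any training loop are illustrative assumptions, not the paper's actual formulation.

import numpy as np
from collections import Counter
from itertools import combinations

# Assumed hyper-parameter: minimum co-occurrence count for a pair to
# receive an explicit conjunction weight (value chosen for the toy data).
PAIR_FREQ_THRESHOLD = 2

class AdaptiveConjunctionScorer:
    def __init__(self, num_features, latent_dim, pair_counts):
        rng = np.random.default_rng(0)
        # Low-rank latent factors, used for infrequent pairs (FM-style).
        self.V = 0.01 * rng.standard_normal((num_features, latent_dim))
        # Explicit conjunction weights, allocated only for frequent pairs.
        self.W = {p: 0.0 for p, c in pair_counts.items()
                  if c >= PAIR_FREQ_THRESHOLD}

    def interaction_score(self, active_features):
        """Sum pairwise interaction terms for one sparse binary instance."""
        score = 0.0
        for i, j in combinations(sorted(active_features), 2):
            if (i, j) in self.W:
                # Frequent pair: dedicated linear conjunction weight.
                score += self.W[(i, j)]
            else:
                # Rare pair: low-rank latent inner product.
                score += float(self.V[i] @ self.V[j])
        return score

# Usage: count feature-pair frequencies on training data, then score.
train = [[0, 2, 5], [0, 2, 7], [1, 5, 7]]          # toy sparse instances
pair_counts = Counter(p for x in train
                      for p in combinations(sorted(x), 2))
scorer = AdaptiveConjunctionScorer(num_features=8, latent_dim=4,
                                   pair_counts=pair_counts)
print(scorer.interaction_score([0, 2, 5]))

In this toy example the pair (0, 2) appears twice and so is scored with an explicit weight, while all other pairs use the latent factors; the weights and factors would of course be learned jointly in an actual system.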
