AutoGroup: Automatic Feature Grouping for Modelling Explicit High-Order Feature Interactions in CTR Prediction

Modelling feature interactions is key in Click-Through Rate (CTR) predictions. State-of-the-art models usually include explicit feature interactions to better model non-linearity in a deep network, but enumerating all feature combinations of high orders is not efficient and brings challenges to network optimization. In this work, we use AutoML to seek useful high-order feature interactions to train on without manual feature selection. For this purpose, an end-to-end model, AutoGroup, is proposed, which casts the selection of feature interactions as a structural optimization problem. In a nutshell, AutoGroup first automatically groups useful features into a number of feature sets. Then, it generates interactions of any order from these feature sets using a novel interaction function. The main contribution of AutoGroup is that it performs both dimensionality reduction and feature selection which are not seen in previous models. Offline experiments on three public large-scale benchmark datasets demonstrate the superior performance and efficiency of AutoGroup over state-of-the-art models. Furthermore, a ten-day online A/B test verifies that AutoGroup can be reliably deployed in production and outperform the commercial baseline by 10% on average in terms of CTR and CVR.

[1]  Chih-Jen Lin,et al.  Training and Testing Low-degree Polynomial Data Mappings via Linear SVM , 2010, J. Mach. Learn. Res..

[2]  Jun Wang,et al.  Product-Based Neural Networks for User Response Prediction , 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).

[3]  Alok Aggarwal,et al.  Regularized Evolution for Image Classifier Architecture Search , 2018, AAAI.

[4]  Tianqi Chen,et al.  XGBoost: A Scalable Tree Boosting System , 2016, KDD.

[5]  Martin Wattenberg,et al.  Ad click prediction: a view from the trenches , 2013, KDD.

[6]  Guorui Zhou,et al.  Deep Interest Network for Click-Through Rate Prediction , 2017, KDD.

[7]  Bin Liu,et al.  Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction , 2019, WWW.

[8]  Jia Li,et al.  Latent Cross: Making Use of Context in Recurrent Recommender Systems , 2018, WSDM.

[9]  Yunming Ye,et al.  DeepFM: A Factorization-Machine based Neural Network for CTR Prediction , 2017, IJCAI.

[10]  Paul Covington,et al.  Deep Neural Networks for YouTube Recommendations , 2016, RecSys.

[11]  Tat-Seng Chua,et al.  Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks , 2017, IJCAI.

[12]  Kurt Hornik,et al.  Multilayer feedforward networks are universal approximators , 1989, Neural Networks.

[13]  Yuandong Tian,et al.  FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Andy Liaw,et al.  Classification and Regression by randomForest , 2007 .

[15]  Gang Fu,et al.  Deep & Cross Network for Ad Click Predictions , 2017, ADKDD@KDD.

[16]  Tat-Seng Chua,et al.  Neural Factorization Machines for Sparse Predictive Analytics , 2017, SIGIR.

[17]  Naiyan Wang,et al.  You Only Search Once: Single Shot Neural Architecture Search via Direct Sparse Optimization , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Steffen Rendle,et al.  Factorization Machines , 2010, 2010 IEEE International Conference on Data Mining.

[19]  Ben Poole,et al.  Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.

[20]  Tie-Yan Liu,et al.  Neural Architecture Optimization , 2018, NeurIPS.

[21]  Xing Xie,et al.  xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems , 2018, KDD.

[22]  Tom Minka,et al.  A* Sampling , 2014, NIPS.

[23]  Yu Sun,et al.  Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising , 2018, WWW.

[24]  Chang Zhou,et al.  Deep Interest Evolution Network for Click-Through Rate Prediction , 2018, AAAI.

[25]  Naonori Ueda,et al.  Higher-Order Factorization Machines , 2016, NIPS.

[26]  Jun Wang,et al.  Deep Learning over Multi-field Categorical Data - - A Case Study on User Response Prediction , 2016, ECIR.

[27]  Steffen Rendle,et al.  Factorization Machines with libFM , 2012, TIST.

[28]  Li Fei-Fei,et al.  Progressive Neural Architecture Search , 2017, ECCV.

[29]  Liang Lin,et al.  SNAS: Stochastic Neural Architecture Search , 2018, ICLR.