Quaternion Factorization Machines: A Lightweight Solution to Intricate Feature Interaction Modelling

Due to the sparsity of available features in web-scale predictive analytics, combinatorial features become a crucial means for deriving accurate predictions. As a well-established approach, a factorization machine (FM) is capable of automatically learning high-order interactions among features to make predictions without the need for manual feature engineering. With the rapid development of deep neural networks (DNNs), there is a recent and ongoing trend of enhancing the expressiveness of FM-based models with DNNs. However, although DNN-based FM variants obtain better results, the performance gain comes at the cost of an enormous number (usually millions) of additional model parameters on top of the plain FM. Consequently, the heavy parameterization impedes the real-life practicality of those deep models, especially their efficient deployment on resource-constrained Internet of Things (IoT) and edge devices. In this article, we move beyond the traditional real space where most deep FM-based models are defined and seek solutions from quaternion representations within the hypercomplex space. Specifically, we propose the quaternion factorization machine (QFM) and the quaternion neural factorization machine (QNFM), two novel lightweight and memory-efficient quaternion-valued models for sparse predictive analytics. By recasting FM-based models in terms of quaternion algebra, our models not only enable expressive inter-component feature interactions but also significantly reduce the parameter size, because the hypercomplex Hamilton product has fewer degrees of freedom than real-valued matrix multiplication. Extensive experimental results on three large-scale datasets demonstrate that QFM achieves a 4.36% performance improvement over the plain FM without introducing any extra parameters, while QNFM outperforms all baselines with up to two orders of magnitude fewer parameters than state-of-the-art peer methods.
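To make the parameter-saving argument concrete, the following is a minimal sketch of the standard Hamilton product and of the usual weight-count comparison between a real-valued dense map and a quaternion-valued one. It is not the authors' implementation: the function names `hamilton_product` and `dense_param_counts` are hypothetical, and the counting assumes a bias-free layer whose 4n real inputs and 4m real outputs are grouped into n and m quaternions respectively.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def dense_param_counts(n_in, n_out):
    """Weight counts for mapping n_in quaternions to n_out quaternions.

    Assumption (illustrative only): no bias terms. A real-valued layer sees
    4*n_in inputs and 4*n_out outputs; a quaternion layer stores one
    quaternion weight (4 reals) per input/output pair and reuses its four
    components across all output components via the Hamilton product.
    """
    real_weights = (4 * n_in) * (4 * n_out)
    quaternion_weights = 4 * n_in * n_out
    return real_weights, quaternion_weights


if __name__ == "__main__":
    i = (0, 1, 0, 0)
    j = (0, 0, 1, 0)
    print(hamilton_product(i, j))  # i * j = k
    print(hamilton_product(j, i))  # j * i = -k (non-commutative)
    print(dense_param_counts(16, 8))
```

Under these assumptions the quaternion layer needs a quarter of the weights of its real-valued counterpart, since each quaternion weight's four components are shared across the four output components. The larger reductions reported for QNFM depend on the full model design, which this sketch does not attempt to reproduce.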
