A Hierarchical Contextual Attention-based GRU Network for Sequential Recommendation

Sequential recommendation is a fundamental task for Web applications. Previous methods are mostly based on Markov chains, which rely on a strong Markov assumption. Recently, recurrent neural networks (RNNs) have become increasingly popular and have demonstrated their effectiveness in many tasks. The last hidden state is usually taken as the sequence's representation for making recommendations. Thanks to the natural characteristics of RNNs, this hidden state combines long-term dependency and short-term interest to some degree. However, the monotonic temporal dependency of RNNs impairs the modeling of the user's short-term interest. Consequently, the hidden state is not sufficient to reflect the user's final interest. To address this problem, we propose a Hierarchical Contextual Attention-based GRU (HCA-GRU) network. The first level of HCA-GRU operates on the input: we construct a contextual input from several recent inputs using an attention mechanism, which models the complicated correlations among recent items and strengthens the hidden state. The second level operates on the hidden state: we fuse the current hidden state with a contextual hidden state built by the attention mechanism, yielding a more suitable representation of the user's overall interest. Experiments on two real-world datasets show that HCA-GRU effectively generates personalized ranking lists and achieves significant improvements.
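The two attention levels described above can be sketched as follows. This is a minimal numpy illustration, not the paper's exact parameterization: the window size `w`, the bilinear attention weights `Wa`, the toy GRU cell, and the final fusion rule are all assumptions introduced here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
d, w = 8, 3  # hidden size and attention window (assumed values)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_context(vectors, query, W):
    # Score each recent vector against the query, then return the
    # attention-weighted sum as the "contextual" vector.
    scores = np.array([query @ W @ v for v in vectors])
    alpha = softmax(scores)
    return (alpha[:, None] * vectors).sum(axis=0)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Standard GRU cell equations (not the paper's specific variant).
Wz, Uz = rng.standard_normal((d, d)) * 0.1, rng.standard_normal((d, d)) * 0.1
Wr, Ur = rng.standard_normal((d, d)) * 0.1, rng.standard_normal((d, d)) * 0.1
Wh, Uh = rng.standard_normal((d, d)) * 0.1, rng.standard_normal((d, d)) * 0.1

def gru_step(x, h):
    z = sigmoid(Wz @ x + Uz @ h)
    r = sigmoid(Wr @ x + Ur @ h)
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))
    return (1 - z) * h + z * h_tilde

Wa = rng.standard_normal((d, d)) * 0.1   # bilinear attention weights (assumed)
items = rng.standard_normal((6, d))      # embeddings of 6 consumed items

h = np.zeros(d)
hidden_states = []
for t in range(len(items)):
    # Level 1: build a contextual input from the last w inputs,
    # using the current hidden state as the attention query.
    recent = items[max(0, t - w + 1): t + 1]
    x_ctx = attention_context(recent, h, Wa)
    h = gru_step(x_ctx, h)
    hidden_states.append(h)

# Level 2: fuse the current hidden state with an attention-weighted
# contextual hidden state over recent steps.
recent_h = np.stack(hidden_states[-w:])
h_ctx = attention_context(recent_h, h, Wa)
user_interest = np.tanh(0.5 * (h + h_ctx))  # simple fusion; the paper's rule may differ
```

The resulting `user_interest` vector would then be matched against candidate item embeddings to produce the personalized ranking list.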
