Attention-based mixture density recurrent networks for history-based recommendation

Recommendation systems are widely used in search, online advertising, e-Commerce, etc., and most products and services can be formulated as a personalized recommendation problem. The goal of personalized history-based recommendation is to dynamically predict the distribution of a user's propensity (online purchase, click, etc.) over time, given the sequence of that user's previous activities. In this paper, using an e-Commerce use case, we present a novel and general recommendation approach with two parts: a recurrent network that summarizes the history of a user's past purchases, with continuous vectors representing items, and an attention-based recurrent mixture density network, which outputs each mixture component dynamically, to accurately model the predictive distribution of future purchases. We evaluate the proposed approach on two publicly available datasets, MovieLens-20M and RecSys15. Both experiments show that the proposed approach, which explicitly models the multi-modal nature of the predictive distribution, greatly improves on various baselines in terms of precision, recall and nDCG. The proposed modeling framework can be easily adapted to many domain-specific problems, such as item recommendation in e-Commerce, ads targeting in online advertising, click-through-rate modeling, etc.
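
As a rough illustration of the idea described above, the following sketch scores items under a mixture density whose components are produced by attention over recurrent states. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the plain tanh recurrence (standing in for whichever recurrent cell the paper uses), one learned attention query per mixture component, isotropic Gaussian components with a fixed scale, and random untrained weights.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def rnn_states(history_vecs, W_in, W_rec):
    # Plain tanh RNN over the item vectors of the user's history (assumed cell).
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in history_vecs:
        h = np.tanh(x @ W_in + h @ W_rec)
        states.append(h)
    return np.stack(states)                      # (T, Dh)

def attention_mixture(states, queries, W_mu, w_pi):
    # One hypothetical attention query per mixture component.
    alpha = softmax(queries @ states.T, axis=1)  # (K, T) attention weights
    contexts = alpha @ states                    # (K, Dh) per-component summary
    means = contexts @ W_mu                      # (K, D) component means in item space
    weights = softmax(contexts @ w_pi)           # (K,) mixing proportions
    return weights, means, alpha

def mixture_log_density(item_vecs, weights, means, sigmas):
    # log p(x) for each item vector under an isotropic Gaussian mixture.
    d = item_vecs.shape[1]
    sq = ((item_vecs[:, None, :] - means[None, :, :]) ** 2).sum(-1)   # (N, K)
    log_comp = -0.5 * sq / sigmas**2 - d * np.log(sigmas) - 0.5 * d * np.log(2 * np.pi)
    a = np.log(weights) + log_comp
    m = a.max(axis=1, keepdims=True)             # log-sum-exp over components
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()

# Toy data: random item vectors and random (untrained) parameters.
rng = np.random.default_rng(0)
n_items, d, dh, k = 50, 8, 16, 3
item_vecs = rng.normal(size=(n_items, d))
history = [3, 17, 42, 8]                         # hypothetical purchase history
W_in = rng.normal(scale=0.3, size=(d, dh))
W_rec = rng.normal(scale=0.3, size=(dh, dh))
queries = rng.normal(size=(k, dh))
W_mu = rng.normal(scale=0.3, size=(dh, d))
w_pi = rng.normal(size=dh)

states = rnn_states(item_vecs[history], W_in, W_rec)
weights, means, alpha = attention_mixture(states, queries, W_mu, w_pi)
log_p = mixture_log_density(item_vecs, weights, means, np.ones(k))
top5 = np.argsort(-log_p)[:5]                    # items ranked by predictive density
print(top5)
```

Because each component attends to the history separately, the predictive density can place mass on several distinct regions of item space at once, which is one way to read the "multi-modal" claim in the abstract. In a trained model the parameters would be fit by maximizing the log density of the observed next item.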
