Deep density networks and uncertainty in recommender systems

Building robust online content recommendation systems requires learning complex interactions between user preferences and content features. The field has evolved rapidly in recent years from traditional multi-arm bandit and collaborative filtering techniques, with new methods employing Deep Learning models to capture non-linearities. Despite progress, the dynamic nature of online recommendations still poses great challenges, such as finding the delicate balance between exploration and exploitation. In this paper we show how uncertainty estimations can be incorporated by employing them in an optimistic exploitation/exploration strategy for more efficient exploration of new recommendations. We provide a novel hybrid deep neural network model, Deep Density Networks (DDN), which integrates content-based deep learning models with a collaborative scheme that is able to robustly model and estimate uncertainty. Finally, we present online and offline results after incorporating DNN into a real world content recommendation system that serves billions of recommendations per day, and show the benefit of using DDN in practice.

[1]  Paul Covington,et al.  Deep Neural Networks for YouTube Recommendations , 2016, RecSys.

[2]  Wei Chu,et al.  A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.

[3]  W. R. Thompson ON THE LIKELIHOOD THAT ONE UNKNOWN PROBABILITY EXCEEDS ANOTHER IN VIEW OF THE EVIDENCE OF TWO SAMPLES , 1933 .

[4]  Greg Linden,et al.  Amazon . com Recommendations Item-to-Item Collaborative Filtering , 2001 .

[5]  Heng-Tze Cheng,et al.  Wide & Deep Learning for Recommender Systems , 2016, DLRS@RecSys.

[6]  Geoffrey E. Hinton,et al.  Bayesian Learning for Neural Networks , 1995 .

[7]  Lina Yao,et al.  Deep Learning Based Recommender System , 2017, ACM Comput. Surv..

[8]  Julien Cornebise,et al.  Weight Uncertainty in Neural Networks , 2015, ArXiv.

[9]  Yukihiro Tagami,et al.  Embedding-based News Recommendation for Millions of Users , 2017, KDD.

[10]  Ruslan Salakhutdinov,et al.  Probabilistic Matrix Factorization , 2007, NIPS.

[11]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[12]  Geoffrey E. Hinton,et al.  Learning to Label Aerial Images from Noisy Data , 2012, ICML.

[13]  Nikolay Laptev,et al.  Deep and Confident Prediction for Time Series at Uber , 2017, 2017 IEEE International Conference on Data Mining Workshops (ICDMW).

[14]  Jacob Goldberger,et al.  Training deep neural-networks using a noise adaptation layer , 2016, ICLR.

[15]  Peter Auer,et al.  Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.

[16]  Zoubin Ghahramani,et al.  Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.

[17]  C. Bishop Mixture density networks , 1994 .

[18]  M. Verleysen,et al.  Classification in the Presence of Label Noise: A Survey , 2014, IEEE Transactions on Neural Networks and Learning Systems.

[19]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[20]  Alex Kendall,et al.  What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? , 2017, NIPS.

[21]  Wei Li,et al.  Exploitation and exploration in a performance based contextual advertising system , 2010, KDD.