Sequential Matrix Completion

We propose a novel algorithm for sequential matrix completion in a recommender system setting, where the $(i,j)$th entry of the matrix corresponds to user $i$'s rating of product $j$. The objective of the algorithm is to provide a sequential policy for recommending user-product pairs that yields the highest possible ratings after a finite time horizon. The algorithm combines a Gamma process factor model with two posterior-focused bandit policies, Thompson Sampling and Information-Directed Sampling. While Thompson Sampling shows competitive performance in simulations, state-of-the-art performance is obtained from Information-Directed Sampling, which bases its recommendations on a ratio between the expected reward and a measure of information gain. To our knowledge, this is the first implementation of Information-Directed Sampling on large real datasets. This approach contributes to a recent line of research on bandit approaches to collaborative filtering, including Kawale et al. (2015), Li et al. (2010), Bresler et al. (2014), Li et al. (2016), Deshpande & Montanari (2012), and Zhao et al. (2013). As noted in Kawale et al. (2015) and Zhao et al. (2013), the setting of this paper presents significant challenges to bounding regret after finite horizons. We discuss these challenges in relation to simpler models for bandits with side information, such as linear or Gaussian process bandits, and hope the experiments presented here motivate further research toward theoretical guarantees.
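To make the Thompson Sampling policy mentioned above concrete, the following is a minimal sketch under strong simplifying assumptions, not the algorithm of this paper. It treats each user-product pair as an independent arm with Poisson-distributed ratings and a conjugate Gamma prior on the mean rating, so it ignores the shared factor structure of the Gamma process factor model. The function name `thompson_sampling`, the prior parameters `a0` and `b0`, and the simulated `true_rates` are all hypothetical.

```python
import numpy as np

def thompson_sampling(true_rates, horizon, a0=1.0, b0=1.0, seed=0):
    """Thompson Sampling with independent Gamma-Poisson arms.

    Assumption: each arm stands in for one user-product pair and arms
    share no information (no low-rank or factor structure).
    """
    rng = np.random.default_rng(seed)
    true_rates = np.asarray(true_rates, dtype=float).ravel()
    shape = np.full(true_rates.shape, a0)  # Gamma posterior shape per arm
    rate = np.full(true_rates.shape, b0)   # Gamma posterior rate per arm
    pulls = np.zeros(true_rates.shape, dtype=int)
    total_reward = 0.0

    for _ in range(horizon):
        # Sample one plausible mean rating per arm from its posterior.
        sampled_means = rng.gamma(shape, 1.0 / rate)
        # Recommend the pair that looks best under this posterior sample.
        arm = int(np.argmax(sampled_means))
        # Observe a simulated rating for the recommended pair.
        reward = rng.poisson(true_rates[arm])
        # Conjugate update: Gamma(a, b) -> Gamma(a + reward, b + 1).
        shape[arm] += reward
        rate[arm] += 1.0
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward, shape / rate

pulls, total_reward, posterior_means = thompson_sampling(
    true_rates=[0.5, 1.0, 5.0], horizon=2000
)
```

In this toy setting the policy concentrates its recommendations on the arm with the highest true mean as its posterior sharpens. A factor model would instead let an observed rating inform the posteriors of other pairs that share a user or a product.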

[1]  Kenneth Y. Goldberg,et al.  Eigentaste: A Constant Time Collaborative Filtering Algorithm , 2001, Information Retrieval.

[2]  Andrea Montanari,et al.  Linear bandits in high dimension and recommendation systems , 2012, 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[3]  Arnaud Doucet,et al.  A Note on Efficient Conditional Simulation of Gaussian Distributions , 2010 .

[4]  Jun Wang,et al.  Interactive collaborative filtering , 2013, CIKM.

[5]  Benjamin Van Roy,et al.  Learning to Optimize via Posterior Sampling , 2013, Math. Oper. Res.

[6]  H. Robbins,et al.  Asymptotically efficient adaptive allocation rules , 1985 .

[7]  Tim Salimans,et al.  Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression , 2012, ArXiv.

[8]  Benjamin Van Roy,et al.  Learning to Optimize via Information-Directed Sampling , 2014, NIPS.

[9]  R. Weber On the Gittins Index for Multiarmed Bandits , 1992 .

[10]  Shuai Li,et al.  Collaborative Filtering Bandits , 2015, SIGIR.

[11]  David A. Knowles Stochastic gradient variational Bayes for gamma approximating distributions , 2015, 1509.01631.

[12]  Wei Chu,et al.  A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.

[13]  Long Tran-Thanh,et al.  Efficient Thompson Sampling for Online Matrix-Factorization Recommendation , 2015, NIPS.

[14]  Emmanuel J. Candès,et al.  Exact Matrix Completion via Convex Optimization , 2009, Found. Comput. Math.

[15]  Karim Lounici High-dimensional covariance matrix estimation with missing observations , 2012, 1201.2577.

[16]  Shipra Agrawal,et al.  Further Optimal Regret Bounds for Thompson Sampling , 2012, AISTATS.

[17]  Devavrat Shah,et al.  A Latent Source Model for Online Collaborative Filtering , 2014, NIPS.

[18]  David M. Blei,et al.  Variational Inference: A Review for Statisticians , 2016, ArXiv.

[19]  Andrea Montanari,et al.  Matrix completion from a few entries , 2009, ISIT.

[20]  Rémi Munos,et al.  Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis , 2012, ALT.

[21]  Andrea Montanari,et al.  Low-rank matrix completion with noisy observations: A quantitative comparison , 2009, 2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[22]  O. Klopp Noisy low-rank matrix completion with general sampling distribution , 2012, 1203.0108.

[23]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[24]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[25]  Franz J. Király,et al.  The algebraic combinatorial approach for low-rank matrix completion , 2012, J. Mach. Learn. Res.

[26]  Emmanuel J. Candès,et al.  Matrix Completion With Noise , 2009, Proceedings of the IEEE.

[27]  Chong Wang,et al.  Stochastic variational inference , 2012, J. Mach. Learn. Res.

[28]  Nigel Boston,et al.  A characterization of deterministic sampling patterns for low-rank matrix completion , 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).