An MDP-Based Recommender System

Typical recommender systems adopt a static view of the recommendation process and treat it as a prediction problem. We argue that it is more appropriate to view the problem of generating recommendations as a sequential decision problem and, consequently, that Markov decision processes (MDPs) provide a more appropriate model for recommender systems. MDPs offer two benefits: they take into account the long-term effects of each recommendation, and they take into account the expected value of each recommendation. To succeed in practice, an MDP-based recommender system must employ a strong initial model, and the bulk of this paper is concerned with the generation of such a model. In particular, we suggest the use of an n-gram predictive model for generating the initial MDP. Our n-gram model induces a Markov-chain model of user behavior whose predictive accuracy is greater than that of existing predictive models. We describe our predictive model in detail and evaluate its performance on real data. In addition, we show how the model can be used in an MDP-based recommender system.
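To make the idea of an n-gram model inducing a Markov chain concrete, the following is a minimal sketch and not the model described in the paper. It assumes that each state is the tuple of the last k items a user selected and estimates transition probabilities by plain maximum-likelihood counting, with no smoothing or other refinements. The names `fit_markov_chain`, `recommend`, and the toy `sessions` data are hypothetical and introduced only for illustration.

```python
from collections import Counter, defaultdict


def fit_markov_chain(sequences, k=2):
    """Estimate P(next item | last k items) from raw counts (assumed MLE, no smoothing)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Slide a window of k items over the sequence; the item after it is the outcome.
        for i in range(len(seq) - k):
            state = tuple(seq[i:i + k])
            counts[state][seq[i + k]] += 1
    # Normalize the counts for each state into a probability distribution.
    return {
        state: {item: c / sum(nxt.values()) for item, c in nxt.items()}
        for state, nxt in counts.items()
    }


def recommend(model, history, k=2, top_n=1):
    """Return the most probable next items given the last k items of the history."""
    state = tuple(history[-k:])
    dist = model.get(state, {})  # unseen state: nothing to recommend in this sketch
    return sorted(dist, key=dist.get, reverse=True)[:top_n]


# Hypothetical toy data: each list is one user's ordered item selections.
sessions = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c", "d"],
]
model = fit_markov_chain(sessions, k=2)
print(recommend(model, ["a", "b"]))
```

A chain of this kind only predicts the next item. In an MDP, the same states would additionally be paired with actions (the recommendations) and rewards, so that a policy can weigh the long-term value of each recommendation rather than only its immediate likelihood.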
