Risk-Aware Recommender Systems

Context-Aware Recommender Systems can naturally be modelled as an exploration/exploitation (exr/exp) trade-off problem, in which the system must choose between maximizing its expected reward given its current knowledge (exploitation) and learning more about the user's unknown preferences to improve that knowledge (exploration). The reinforcement learning community has addressed this problem, but existing approaches do not consider the risk level of the user's current situation: when the risk level is high, it may be dangerous to recommend items the user does not want in that situation. In this paper we introduce an algorithm named R-UCB that takes the risk level of the user's situation into account to adaptively balance exr and exp. A detailed analysis of the experimental results reveals several important findings about the exr/exp behaviour.
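To make the idea of risk-adaptive exploration concrete, the following is a minimal sketch of a UCB-style bandit whose exploration bonus shrinks as the situation's risk grows. It is not the R-UCB algorithm itself, whose details are not given here. The class name `RiskScaledUCB`, the `base_exploration` parameter, the assumption that risk is a number in [0, 1], and the linear `(1 - risk)` scaling are all illustrative assumptions.

```python
import math


class RiskScaledUCB:
    """Illustrative UCB1 variant; NOT the paper's R-UCB.

    Assumption: `risk` lies in [0, 1] and scales the exploration
    bonus linearly, so risk = 1 means pure exploitation.
    """

    def __init__(self, n_arms, base_exploration=2.0):
        self.counts = [0] * n_arms      # number of pulls per arm
        self.means = [0.0] * n_arms     # running mean reward per arm
        self.base_exploration = base_exploration

    def select(self, risk):
        """Return the index of the arm to recommend at this risk level."""
        if not 0.0 <= risk <= 1.0:
            raise ValueError("risk must be in [0, 1]")
        # Simplification: every arm is tried once, regardless of risk.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        # Higher risk -> smaller exploration bonus (hypothetical rule).
        scale = self.base_exploration * (1.0 - risk)
        scores = [
            mean + math.sqrt(scale * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        """Fold an observed reward into the arm's running mean."""
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

With one well-explored arm and one barely explored arm, `select(1.0)` returns the arm with the highest observed mean, while `select(0.0)` behaves like standard UCB1 and favours the under-explored arm. A real system would also need a way to estimate the risk level from the user's context, which this sketch leaves out.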
