Paper information - PyRecGym: a reinforcement learning gym for recommender systems

Recommender systems (RS) share many features and objectives with reinforcement learning (RL) systems. The former aim to maximise user satisfaction by recommending the right items to the right users at the right time; the latter aim to maximise future rewards by selecting state-changing actions in some environment. The concept of an RL gym has become increasingly important in supporting the development of RL models. A gym provides a simulation environment in which to develop and test RL agents, supplying a state model, actions, rewards/penalties, etc. In this paper we describe and demonstrate PyRecGym, a gym designed specifically for the needs of recommender systems research. It supports standard test datasets (MovieLens, Yelp, etc.) and common input types (text, numeric, etc.), thereby offering researchers a reproducible environment to accelerate experimentation and development of RL in RS.
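To make the gym idea concrete, the following is a minimal sketch of a gym-style environment for recommendation, following the reset/step convention popularised by OpenAI Gym. It is not the PyRecGym API: the class name `ToyRecEnv`, the state dictionary, and the reward rule (1 if the logged rating of the recommended item is at least 4, otherwise 0, with unlogged user-item pairs treated as 0) are all assumptions made for illustration.

```python
import random


class ToyRecEnv:
    """Hypothetical gym-style recommendation environment (not the PyRecGym API)."""

    def __init__(self, interactions, seed=0):
        # interactions: list of (user_id, item_id, rating) tuples from a ratings log
        self.interactions = list(interactions)
        self.items = sorted({item for _, item, _ in self.interactions})  # action space
        self.ratings = {(u, i): r for u, i, r in self.interactions}
        self.rng = random.Random(seed)
        self.order = []
        self.t = 0

    def reset(self):
        # Start a new episode by replaying the log in a shuffled order.
        self.order = self.interactions[:]
        self.rng.shuffle(self.order)
        self.t = 0
        return self._state()

    def _state(self):
        # State model (assumed): the user currently awaiting a recommendation.
        user, _, _ = self.order[self.t]
        return {"user": user, "step": self.t}

    def step(self, action):
        # action: the item_id recommended to the current user.
        user, _, _ = self.order[self.t]
        # Assumed reward rule: 1.0 if the log shows a rating >= 4, else 0.0.
        reward = 1.0 if self.ratings.get((user, action), 0) >= 4 else 0.0
        self.t += 1
        done = self.t >= len(self.order)
        state = None if done else self._state()
        return state, reward, done, {}


# Usage: a random agent interacting with the environment for one episode.
log = [("u1", "i1", 5), ("u1", "i2", 2), ("u2", "i2", 4), ("u2", "i3", 1)]
env = ToyRecEnv(log, seed=0)
agent_rng = random.Random(1)
state, done, total_reward = env.reset(), False, 0.0
while not done:
    action = agent_rng.choice(env.items)
    state, reward, done, _ = env.step(action)
    total_reward += reward
print("episode reward:", total_reward)
```

Replaying logged ratings is only one possible way to simulate user feedback; it cannot score items a user never rated, which is why the sketch falls back to a reward of 0 for those pairs.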
