Paper information - A Stream-based Resource for Multi-Dimensional Evaluation of Recommender Algorithms

Recommender System research has evolved to focus on developing algorithms capable of high performance in online systems. This development calls for a new evaluation infrastructure that supports multi-dimensional evaluation of recommender systems. Today's researchers should analyze algorithms with respect to a variety of aspects including predictive performance and scalability. Researchers need to subject algorithms to realistic conditions in online A/B tests. We introduce two resources supporting such evaluation methodologies: the new data set of stream recommendation interactions released for CLEF NewsREEL 2017, and the new Open Recommendation Platform (ORP). The data set allows researchers to study a stream recommendation problem closely by "replaying" it locally, and ORP makes it possible to take this evaluation "live" in a living lab scenario. Specifically, ORP allows researchers to deploy their algorithms in a live stream to carry out A/B tests. To our knowledge, NewsREEL is the first online news recommender system resource to be put at the disposal of the research community. In order to encourage others to develop comparable resources for a wide range of domains, we present a list of practical lessons learned in the development of the dataset and ORP.
