Looking for "Good" Recommendations: A Comparative Evaluation of Recommender Systems

A number of studies in the Recommender Systems (RSs) domain suggest that the recommendations that are "best" according to objective metrics are sometimes not the most satisfactory or useful to users. This paper investigates the quality of RSs from a user-centric perspective. We discuss an empirical study that involved 210 users and seven RSs built on the same dataset, each using a different baseline or state-of-the-art recommendation algorithm. We measured the perceived quality of each system, focusing on the accuracy and novelty of the recommended items and on overall user satisfaction. We ranked the recommenders with respect to these attributes and compared the results against measures of the statistical quality of the algorithms as assessed by past studies in the field using information retrieval and machine learning methodologies.
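The two objective attributes mentioned above, accuracy and novelty of a recommendation list, are commonly operationalized as precision at N and as the mean self-information (rarity) of the recommended items. The sketch below illustrates these standard formulations with hypothetical helper names; it is an assumption for illustration, not the exact measures used in the study.

```python
from math import log2

def precision_at_n(recommended, relevant, n):
    """Fraction of the top-n recommended items the user judged relevant."""
    top_n = recommended[:n]
    return sum(1 for item in top_n if item in relevant) / n

def novelty(recommended, interaction_counts, num_users):
    """Mean self-information of the recommended items.

    An item seen by few users carries more information (-log2 of its
    popularity), so rarer recommendations score as more novel.
    Items with no recorded interactions are skipped.
    """
    scores = []
    for item in recommended:
        popularity = interaction_counts.get(item, 0) / num_users
        if popularity > 0:
            scores.append(-log2(popularity))
    return sum(scores) / len(scores) if scores else 0.0

# Toy example: 4 users; item "a" was consumed by all of them,
# "b" by two, "c" by one. The list recommends a, c, and an unseen item d.
counts = {"a": 4, "b": 2, "c": 1}
recs = ["a", "c", "d"]
print(precision_at_n(recs, relevant={"c"}, n=3))  # 1 relevant hit in top 3
print(novelty(recs, counts, num_users=4))         # "a" contributes 0, "c" contributes 2
```

Under these definitions, an algorithm that only recommends blockbusters can score well on precision while scoring near zero on novelty, which is exactly the kind of trade-off a user-centric evaluation can surface.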
