Rank and relevance in novelty and diversity metrics for recommender systems

The Recommender Systems community is paying increasing attention to novelty and diversity as key qualities beyond accuracy in real recommendation scenarios. Despite the raise of interest and work on the topic in recent years, we find that a clear common methodological and conceptual ground for the evaluation of these dimensions is still to be consolidated. Different evaluation metrics have been reported in the literature but the precise relation, distinction or equivalence between them has not been explicitly studied. Furthermore, the metrics reported so far miss important properties such as taking into consideration the ranking of recommended items, or whether items are relevant or not, when assessing the novelty and diversity of recommendations. We present a formal framework for the definition of novelty and diversity metrics that unifies and generalizes several state of the art metrics. We identify three essential ground concepts at the roots of novelty and diversity: choice, discovery and relevance, upon which the framework is built. Item rank and relevance are introduced through a probabilistic recommendation browsing model, building upon the same three basic concepts. Based on the combination of ground elements, and the assumptions of the browsing model, different metrics and variants unfold. We report experimental observations which validate and illustrate the properties of the proposed metrics.

[1]  Òscar Celma,et al.  A new approach to evaluating novel recommendations , 2008, RecSys '08.

[2]  Licia Capra,et al.  Temporal diversity in recommender systems , 2010, SIGIR.

[3]  Sreenivas Gollapudi,et al.  Diversifying search results , 2009, WSDM '09.

[4]  Yuchen Zhang,et al.  Characterizing search intent diversity into click models , 2011, WWW.

[5]  Ben Carterette,et al.  System effectiveness, user models, and user utility: a conceptual framework for investigation , 2011, SIGIR.

[6]  Alistair Moffat,et al.  Rank-biased precision for measurement of retrieval effectiveness , 2008, TOIS.

[7]  Jade Goldstein-Stewart,et al.  The use of MMR, diversity-based reranking for reordering documents and producing summaries , 1998, SIGIR '98.

[8]  Gediminas Adomavicius,et al.  Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques , 2012, IEEE Transactions on Knowledge and Data Engineering.

[9]  Kartik Hosanagar,et al.  Blockbuster Culture's Next Rise or Fall: The Impact of Recommender Systems on Sales Diversity , 2007, Manag. Sci..

[10]  Yehuda Koren,et al.  Matrix Factorization Techniques for Recommender Systems , 2009, Computer.

[11]  Yi-Cheng Zhang,et al.  Solving the apparent diversity-accuracy dilemma of recommender systems , 2008, Proceedings of the National Academy of Sciences.

[12]  Jonathan L. Herlocker,et al.  Evaluating collaborative filtering recommender systems , 2004, TOIS.

[13]  Sean M. McNee,et al.  Being accurate is not enough: how accuracy metrics have hurt recommender systems , 2006, CHI Extended Abstracts.

[14]  Charles L. A. Clarke,et al.  A comparative analysis of cascade measures for novelty and diversity , 2011, WSDM '11.

[15]  Filip Radlinski,et al.  Learning diverse rankings with multi-armed bandits , 2008, ICML '08.

[16]  Pablo Castells,et al.  A study of heterogeneity in recommendations for a social music service , 2010, HetRec '10.

[17]  Sean M. McNee,et al.  Improving recommendation lists through topic diversification , 2005, WWW '05.

[18]  Mi Zhang,et al.  Avoiding monotony: improving the diversity of recommendation lists , 2008, RecSys '08.

[19]  Charles L. A. Clarke,et al.  Novelty and diversity in information retrieval evaluation , 2008, SIGIR '08.

[20]  Saul Vargas,et al.  Intent-oriented diversity in recommender systems , 2011, SIGIR.

[21]  David Heckerman,et al.  Empirical Analysis of Predictive Algorithms for Collaborative Filtering , 1998, UAI.

[22]  Olivier Chapelle,et al.  Expected reciprocal rank for graded relevance , 2009, CIKM.