Evaluating the effectiveness of explanations for recommender systems

When recommender systems present items, these can be accompanied by explanatory information. Such explanations can serve seven aims: effectiveness, satisfaction, transparency, scrutability, trust, persuasiveness, and efficiency. These aims can be incompatible, so any evaluation needs to state which aim is being investigated and use appropriate metrics. This paper focuses particularly on effectiveness (helping users to make good decisions) and its trade-off with satisfaction. It provides an overview of existing work on evaluating effectiveness and the metrics used. It also highlights the limitations of the existing effectiveness metrics, in particular the effects of under- and overestimation and of the recommendation domain. In addition to this methodological contribution, the paper presents four empirical studies in two domains: movies and cameras. These studies investigate the impact of personalizing simple feature-based explanations on effectiveness and satisfaction. Both approximated and real effectiveness are investigated. Contrary to expectation, personalization was detrimental to effectiveness, though it may improve user satisfaction. The studies also highlighted the importance of considering opt-out rates and the underlying rating distribution when evaluating effectiveness.
