Evaluating Contextual Suggestion

As its primary evaluation measure, the TREC 2012 Contextual Suggestion Track used precision@5. Unfortunately, this measure is not ideally suited to the task, which differs from the IR settings where precision@5 and similar measures can more readily be used. Track participants returned travel suggestions that included brief descriptions, and the availability of these descriptions allows users to quickly skip suggestions that are not of interest to them. A user’s reaction to a suggestion could be negative (“dislike”), as well as positive (“like”) or neutral, and too many disliked suggestions may cause the user to abandon the results. Neither of these factors is handled appropriately by traditional evaluation methodologies for information retrieval and recommendation. Building on the time-biased gain framework of Smucker and Clarke, which recognizes time as a critical element in user modeling for evaluation, we propose a new evaluation measure that directly accommodates these factors.