Whole-session evaluation of interactive information retrieval systems: Compilation of Homework

This paper points out a number of major challenges in the evaluation of interactive IR. The main problems identified with current approaches are: (i) user and task models are not adequately captured; (ii) information continually changes over time; (iii) IIR tasks are often very complex, and thus hard to model as they evolve, and may not have fixed endpoints; and (iv) IIR often occurs over time and across sessions. While the paper doesn't provide any concrete solutions to these problems, the most promising suggestion is the use of a "living laboratory". The development of such a living lab, open to researchers, would certainly provide a number of ways to evaluate users in the wild, overcoming some of the pragmatic problems typically associated with evaluation.

These two works suggest that we should focus on the sequence in which users experience, encounter, and process information. Bookstein tries to model the retrieval process as a sequence in order to develop a better retrieval system (and is perhaps a precursor to the Interactive Probability Ranking Principle). Tague-Sutcliffe, on the other hand, tries to measure the informativeness of the process (where informativeness is akin to the novelty and diversity measures being developed). Key in these works is the focus on the order in which users examine documents.

This paper provides a novel and potentially interesting solution to evaluation across a session. In some respects this work blends developments in HCI with IR: it takes a GOMS-like approach, following Card and Moran, together with Dunlop's work on time, relevance, and interaction modelling, to produce a "probabilistic GOMS" for IR, in which the main actions in the search process are each assigned a time, and a probability is assigned to these actions. This provides an interesting way to examine and explore a range of potential interactions with the system, as a way to cater for the variety of ways that users interact with systems (a minimal sketch of the mechanics is given below).

In terms of evaluating the whole session, I have been particularly interested in developing measures that examine how well a user uses an application. The fundamental idea is that what should be evaluated is the sequence of interactions and documents that the user examines and inspects during the process (i.e. following on from Bookstein and Tague-Sutcliffe, along with Norman's idea that the user experience is defined by the sequence of interactions); the second sketch below illustrates such a sequence-based measure. The experience, whether it be engagement, utility, fun, etc., at any particular point …
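To make the "probabilistic GOMS" idea concrete, here is a minimal sketch in Python. The action names, times, and transition probabilities are all illustrative assumptions, not values from the paper; the point is only to show how assigning a time and a probability to each action lets one sample, and thereby explore, a range of plausible interaction sequences.

```python
import random

# Illustrative sketch of a "probabilistic GOMS" for search.
# All action names, times, and probabilities below are assumed
# placeholders, not values taken from the paper.

ACTION_TIME = {            # seconds per action (assumed)
    "query": 10.0,
    "assess_snippet": 2.0,
    "read_document": 30.0,
}

# Assumed transition probabilities between actions (each row sums to 1).
TRANSITIONS = {
    "query":          [("assess_snippet", 1.0)],
    "assess_snippet": [("read_document", 0.3),
                       ("assess_snippet", 0.5),
                       ("query", 0.2)],
    "read_document":  [("assess_snippet", 0.6),
                       ("query", 0.4)],
}

def simulate_session(budget=300.0):
    """Sample one interaction sequence until the time budget runs out."""
    action, elapsed, trace = "query", 0.0, []
    while elapsed + ACTION_TIME[action] <= budget:
        elapsed += ACTION_TIME[action]
        trace.append(action)
        r, cumulative = random.random(), 0.0
        for nxt, p in TRANSITIONS[action]:
            cumulative += p
            if r <= cumulative:
                action = nxt
                break
    return trace, elapsed

if __name__ == "__main__":
    trace, elapsed = simulate_session()
    print(f"{len(trace)} actions in {elapsed:.0f}s:", trace[:8], "...")
```

Sampling many such sessions yields a distribution over interaction sequences and their time costs, which is the sense in which the approach caters for the variety of ways users interact with a system.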
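In the same spirit, here is a minimal sketch of a sequence-based measure, loosely in the style of Tague-Sutcliffe's informativeness and session-based discounted cumulated gain. The gain values, the logarithmic discount, and the rule that a document only earns credit the first time it is examined are all assumptions for illustration, not the measures from the works cited above.

```python
import math

def sequence_gain(examined, relevance, log_base=2.0):
    """Discounted gain over the order in which documents are examined.

    A document contributes only the first time it is seen (a crude
    novelty model); re-examinations still consume a position and so
    push later gains further down the discount.
    """
    seen, total = set(), 0.0
    for position, doc in enumerate(examined, start=1):
        if doc in seen:
            continue  # no credit for re-examining a document
        seen.add(doc)
        total += relevance.get(doc, 0) / math.log(position + 1, log_base)
    return total

# Two users examine the same documents, in different orders
# (graded relevance values are assumed for illustration).
relevance = {"d1": 3, "d2": 0, "d3": 2}
print(sequence_gain(["d1", "d2", "d3"], relevance))  # relevant docs early: 4.00
print(sequence_gain(["d2", "d3", "d1"], relevance))  # relevant docs late: ~2.76
```

Even though both users inspect the same set of documents, the measure rewards the one who encounters the relevant material earlier, which is exactly the order-sensitivity that Bookstein and Tague-Sutcliffe argue a whole-session evaluation needs.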

[1] Doug Downey, et al. Understanding the relationship between searchers' queries and information goals, 2008, CIKM '08.

[2] Noriko Kando, et al. Using Concept Map to Evaluate Learning by Searching, 2012, CogSci.

[3] Robert Villa, et al. Interaction Pool: Towards a user-centred test collection, 2007.

[4] Susan T. Dumais, et al. Evaluation Challenges and Directions for Information-Seeking Support Systems, 2009, Computer.

[5] Arjen P. de Vries, et al. Supporting children's web search in school environments, 2012, IIiX.

[6] Jacek Gwizdka, et al. Distribution of cognitive load in Web search, 2010, J. Assoc. Inf. Sci. Technol.

[7] Charles L. A. Clarke, et al. Time-based calibration of effectiveness measures, 2012, SIGIR '12.

[8] Yvonne Kammerer, et al. Signpost from the masses: learning effects in an exploratory social tag search browser, 2009, CHI.

[9] J. Schneider, et al. The Janus Faced Scholar: A Festschrift in Honour of Peter Ingwersen, 2010.

[10] Kalervo Järvelin, et al. Time drives interaction: simulating sessions in diverse searching environments, 2012, SIGIR '12.

[11] Morten Hertzum, et al. Browsing and querying in online documentation: a study of user interfaces and the interaction process, 1996, TCHI.

[12] Kalervo Järvelin. Interactive relevance feedback with graded relevance and sentence extraction: simulated user experiments, 2009, CIKM.

[13] Young-In Song, et al. Click the search button and be happy: evaluating direct and immediate information access, 2011, CIKM '11.

[14] Yiming Yang, et al. Modeling Expected Utility of Multi-session Information Distillation, 2009, ICTIR.

[15] Pertti Vakkari, et al. Exploratory Searching As Conceptual Exploration, 2010.

[16] Noriko Kando, et al. A method to capture information encountering embedded in exploratory Web searches, 2011, Inf. Res.

[17] Balder ten Cate, et al. Question Answering: From Partitions to Prolog, 2002, TABLEAUX.

[18] Wei Chu, et al. Modeling the impact of short- and long-term behavior on search personalization, 2012, SIGIR '12.

[19] Information Interaction in Context 2012, IIiX '12, Nijmegen, The Netherlands, August 21-24, 2012, IIiX.

[20] Florian Metze, et al. Spoken Web Search, 2011, MediaEval.

[21] Leif Azzopardi. The economics in interactive information retrieval, 2011, SIGIR.

[22] Benjamin S. Bloom, et al. A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives, 2000.

[23] Joseph Y. Halpern, et al. Knowledge, probability, and adversaries, 1989, PODC '89.

[24] Norbert Fuhr, et al. A probability ranking principle for interactive information retrieval, 2008, Information Retrieval.

[25] Hideo Joho, et al. Constraint Can Affect Human Perception, Behaviour, and Performance of Search, 2012, ICADL.

[26] R. Wilson. On the evaluation of …, 1940.

[27] Noriko Kando, et al. Differences between informational and transactional tasks in information seeking on the web, 2008, IIiX.

[28] J. R. van Ossenbruggen, et al. Implicit relevance feedback from a multi-step search process: a use of query-logs, 2011.

[29] Kalervo Järvelin, et al. Information interaction in molecular medicine: integrated use of multiple channels, 2010, IIiX.

[30] Lois M. L. Delcambre, et al. Discounted Cumulated Gain Based Evaluation of Multiple-Query IR Sessions, 2008, ECIR.

[31] Rosie Jones, et al. Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs, 2008, CIKM '08.

[32] Norbert Fuhr. A probability ranking principle for interactive information retrieval, 2008.

[33] Nicholas J. Belkin, et al. Report on the SIGIR workshop on "entertain me": supporting complex search tasks, 2012, SIGIR Forum.

[34] Nicholas J. Belkin, et al. On the evaluation of interactive information retrieval systems, 2010.

[35] Susan T. Dumais, et al. Evaluating implicit measures to improve the search experiences, 2003.

[36] Arjen P. de Vries, et al. Explaining Query Modifications - An Alternative Interpretation of Term Addition and Removal, 2012, ECIR.

[37] Noriko Kando, et al. Connecting Qualitative and Quantitative Analysis of Web Search Process: Analysis Using Search Units, 2010, AIRS.

[38] Pia Borlund, et al. The IIR evaluation model: a framework for evaluation of interactive information retrieval systems, 2003, Inf. Res.

[39] Noriko Kando, et al. Using a concept map to evaluate exploratory search, 2010, IIiX.

[40] Abigail Sellen, et al. "It's simply integral to what I do": enquiries into how the web is weaved into everyday life, 2012, WWW.

[41] Ryen W. White, et al. Modeling and analysis of cross-session search tasks, 2011, SIGIR.

[42] Joseph Y. Halpern, et al. Reasoning about knowledge: a survey, 1995.

[43] Barteld P. Kooi, et al. Probabilistic Dynamic Epistemic Logic, 2003, J. Log. Lang. Inf.

[44] J. Liu, et al. Usefulness as the Criterion for Evaluation of Interactive Information Retrieval, 2009.

[45] Barbara M. Wildemuth, et al. The effects of domain knowledge on search tactic formulation, 2004, J. Assoc. Inf. Sci. Technol.

[46] Ben Carterette, et al. Overview of the TREC 2011 Session Track, 2011, TREC.

[47] Benjamin S. Bloom, et al. Taxonomy of Educational Objectives: The Classification of Educational Goals, 1957.

[48] Jean Tague-Sutcliffe. Measuring the informativeness of a retrieval process, 1992, SIGIR '92.

[49] Nicholas J. Belkin, et al. "entertain me": Supporting Complex Search Tasks, 2011.

[50] Fabio Crestani, et al. The Troubles with Using a Logical Model of IR on a Large Collection of Documents, 1995, TREC.

[51] Stephen E. Robertson, et al. On the Evaluation of IR Systems, 1992, Inf. Process. Manag.

[52] Leif Azzopardi. Usage based effectiveness measures: monitoring application performance in information retrieval, 2009, CIKM.

[53] Max L. Wilson, et al. A comparison of techniques for measuring sensemaking and learning within participant-generated summaries, 2013, J. Assoc. Inf. Sci. Technol.

[54] Abraham Bookstein, et al. Information retrieval: A sequential learning process, 1983, J. Am. Soc. Inf. Sci.

[55] Norbert Fuhr, et al. Using eye-tracking with dynamic areas of interest for analyzing interactive information retrieval, 2012, SIGIR '12.

[56] Arjen P. de Vries, et al. Semantic search log analysis: A method and a study on professional image search, 2011, J. Assoc. Inf. Sci. Technol.

[57] Ben Carterette, et al. Adapting Query Expansion to Search Proficiency, 2012.

[58] Diane Kelly, et al. Methods for Evaluating Interactive Information Retrieval Systems with Users, 2009, Found. Trends Inf. Retr.

[59] Franz Baader, et al. A Description Logic Based Approach to Reasoning about Web Services, 2005, WWW 2005.

[60] Maarten Marx, et al. Evaluation Methods for Rankings of Facetvalues for Faceted Search, 2011, CLEF.

[61] Noriko Kando, et al. Evaluation of Interactive Information Access System using Concept Map, 2011, EVIA@NTCIR.