Several recent studies have found only a weak relationship between the performance of a retrieval system and the "success" achievable by human searchers. We hypothesize that searchers are successful precisely because they alter their behavior. To explore the possible causal relationship between system performance and search behavior, we control system performance, hoping to elicit adaptive search behaviors. Thirty-six subjects each completed 12 searches using either a standard system or one of two degraded systems. Using a general linear model, we isolate the main effect of system performance by measuring and removing main effects due to searcher variation, topic difficulty, and the position of each search in the time series. We find that searchers using our degraded systems are as successful as those using the standard system, but that, in achieving this success, they alter their behavior in ways that could be measured in real time by a suitably instrumented system. Our findings suggest, quite generally, that some aspects of behavioral dynamics may provide unobtrusive indicators of system performance.
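
The analysis described above can be sketched in code. The following is a minimal illustration, not the authors' actual analysis: it fits an additive general linear model with categorical main effects for searcher, topic, position in the time series, and system condition. The file name and the column names (subject, topic, position, system, success) are hypothetical stand-ins for the study's data.

    # Minimal sketch: isolating the main effect of system condition with a
    # general linear model. All data/column names here are hypothetical.
    import pandas as pd
    import statsmodels.formula.api as smf

    # One row per search, in long format: 36 subjects x 12 searches each.
    searches = pd.read_csv("searches.csv")

    # Additive main effects: searcher variation, topic difficulty, the
    # position of the search in the time series, and the system condition.
    model = smf.ols(
        "success ~ C(subject) + C(topic) + C(position) + C(system)",
        data=searches,
    ).fit()

    # The C(system) coefficients estimate the effect of the degraded systems
    # after the other main effects have been measured and removed.
    print(model.summary())

In a design like this, treating subject, topic, and position as categorical factors absorbs their variance, so whatever remains attributable to the system factor is the main effect of interest.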