Within-Document Retrieval: A User-Centred Evaluation of Relevance Profiling

We present a user-centred, task-oriented, comparative evaluation of two within-document retrieval tools. ProfileSkim computes a relevance profile for a document with respect to a query and presents the profile as an interactive bar graph. FindSkim provides functionality similar to the web browser “Find” command. We devised a novel simulated work task in which participants are asked to identify (index) the relevant pages of an electronic book, given topics from the existing book index. The original book index provides the ground truth against which the participants' indexing results can be compared. We confirmed a major hypothesis, namely that ProfileSkim is significantly more efficient than FindSkim, as measured by time taken to complete the task. The study indicates that ProfileSkim was at least as effective as FindSkim in identifying relevant pages, as measured by traditional information retrieval measures, and there is some evidence that ProfileSkim is a precision-enhancing tool. Qualitative data from questionnaires also provide strong evidence for our conjecture that participants would be more satisfied when using ProfileSkim than when using FindSkim. The experimental study confirmed the potential of relevance profiling for improving within-document retrieval. Relevance profiling should prove highly beneficial for users trying to identify relevant information within long documents.
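The comparison against the book index can be illustrated with a minimal sketch of page-level precision and recall. This is an assumption about how the "traditional information retrieval measures" might be computed for one topic, not the study's actual scoring procedure; the function name and the page numbers are hypothetical.

```python
def precision_recall(selected_pages, index_pages):
    """Score one participant's page selection for a single topic.

    selected_pages: pages the participant marked as relevant (hypothetical data).
    index_pages: pages listed for the topic in the original book index,
                 treated here as the ground truth.
    """
    selected = set(selected_pages)
    relevant = set(index_pages)
    hits = selected & relevant  # pages both selected and in the index

    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(selected) if selected else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative values only: 3 of the 4 selected pages appear in the index,
# and 3 of the 5 index pages were found.
precision, recall = precision_recall([12, 13, 40, 57], [12, 13, 14, 57, 90])
print(precision, recall)  # 0.75 0.6
```

Under this reading, a precision-enhancing tool would raise the first value (fewer non-index pages selected) without necessarily changing the second.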
