1. Module name: Information Retrieval System Evaluation

2. Scope: This module introduces evaluation in information retrieval. It focuses on the standard measures of system effectiveness based on relevance judgments.

3. Learning objectives: Students should be able to:
   a. Understand relevance judgments and the techniques applied to evaluate unranked and ranked IR systems.
   b. Evaluate an IR system given a document collection and a set of information needs, and interpret the results.

4. 5S characteristics of the module:
   a. Streams: Relevance judgment results are the "ground truth" of a test collection. Lucene's "quality" package takes the ground truth, the queries, and the search results from the index as inputs and produces search-quality measurements (see the sketch following this outline).
   b. Structures: The built-in query and topic formats in the Lucene benchmark are based on the format of the TREC corpus. Documents in Lucidworks are indexed in XML format.
   c. Spaces: The ADI documents are located on the server running the Lucidworks software.
   d. Scenarios: Relevance judgments are required to build a test collection from a document collection and a set of information needs. The test collection is then used to evaluate information retrieval systems.

5. This module should take at least 4 hours to complete.
   a. Out-of-class: Students are expected to spend at least 4 hours completing the module and its exercises. Time should be spent reading the material from the textbook and the Lucene chapters, as well as revisiting the relevant lectures [12e, 12f, 12d].
   b. In-class: Students will have the opportunity to ask questions and discuss the exercises with their teammates.

6. Relationships with other modules (flow between modules):
   a. Overview of the Lucidworks Big Data software module: The Lucidworks overview module introduces the software and provides instructions for learning it. The current module requires the Lucidworks software to perform the exercises.
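As a rough illustration of the Streams item above, Lucene's benchmark module provides a "quality" package (org.apache.lucene.benchmark.quality) that reads TREC-format topics and relevance judgments and scores searches against an existing index. The following is a minimal sketch only, assuming a recent Lucene release; the file names (adi-topics.txt, adi-qrels.txt), the index directory, and the field names ("title", "contents", "docname") are illustrative assumptions and must match how the collection was actually indexed.

import java.io.BufferedReader;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.lucene.benchmark.quality.Judge;
import org.apache.lucene.benchmark.quality.QualityBenchmark;
import org.apache.lucene.benchmark.quality.QualityQuery;
import org.apache.lucene.benchmark.quality.QualityQueryParser;
import org.apache.lucene.benchmark.quality.QualityStats;
import org.apache.lucene.benchmark.quality.trec.TrecJudge;
import org.apache.lucene.benchmark.quality.trec.TrecTopicsReader;
import org.apache.lucene.benchmark.quality.utils.SimpleQQParser;
import org.apache.lucene.benchmark.quality.utils.SubmissionReport;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;

public class QualityRun {
  public static void main(String[] args) throws Exception {
    // TREC-format topics (queries) and qrels (relevance judgments, the "ground truth");
    // the file names here are assumptions for this sketch.
    BufferedReader topics = Files.newBufferedReader(Paths.get("adi-topics.txt"), StandardCharsets.UTF_8);
    BufferedReader qrels  = Files.newBufferedReader(Paths.get("adi-qrels.txt"), StandardCharsets.UTF_8);

    // Read the topics into QualityQuery objects, one per information need.
    QualityQuery[] queries = new TrecTopicsReader().readQueries(topics);

    // The judge holds the relevance judgments and decides which retrieved documents are relevant.
    Judge judge = new TrecJudge(qrels);
    judge.validateData(queries, new PrintWriter(System.err, true));

    // Parse each topic's <title> field into a query against the assumed "contents" index field.
    QualityQueryParser qqParser = new SimpleQQParser("title", "contents");

    // Open the existing index; "docname" is assumed to be the stored document-id field.
    IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(FSDirectory.open(Paths.get("index"))));
    QualityBenchmark benchmark = new QualityBenchmark(queries, qqParser, searcher, "docname");

    // Run all topics, write a TREC-style submission file, and log per-topic quality statistics.
    PrintWriter log = new PrintWriter(System.out, true);
    SubmissionReport report = new SubmissionReport(new PrintWriter("submission.txt"), "adi-run");
    QualityStats[] stats = benchmark.execute(judge, report, log);

    // Average the precision/recall statistics (e.g., MAP, precision at fixed cutoffs) over all topics.
    QualityStats avg = QualityStats.average(stats);
    avg.log("SUMMARY", 2, log, "  ");
  }
}

Each qrels line follows the standard TREC layout (topic id, an unused iteration column, document id, relevance), for example "1 0 ADI-17 1" for a hypothetical document id; the topic ids must match those in the topics file for the judge to pair judgments with queries.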