Incremental Relevance Feedback in Japanese Text Retrieval

Relevance feedback techniques have been shown to improve retrieval performance for a number of information retrieval tasks. This paper explores incremental relevance feedback for ad hoc Japanese text retrieval, examining, separately and in combination, the utility of term reweighting and query expansion using a probabilistic retrieval model. Retrieval performance is evaluated using standard precision-recall measures and “number-to-view” graphs. Experimental results on the standard BMIR-J2 Japanese language retrieval collection show that both term reweighting and query expansion improve retrieval performance. This is reflected not only in improvements in both precision and recall, but also in a reduction in the average number of documents that must be viewed to find a selected number of relevant items. In particular, using a simple simulation of user searching, incremental application of relevance information is shown to lead to progressively improved retrieval performance and an overall reduction in the number of documents that a user must view to find relevant ones.
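The abstract names three ingredients without giving formulas: term reweighting, query expansion, and a number-to-view measure. The sketch below illustrates one common way these are realised in probabilistic retrieval. It assumes the standard Robertson/Sparck Jones relevance weight with 0.5 smoothing and an offer weight of r times that weight for ranking expansion terms; whether the paper uses exactly these forms is an assumption. The function names, the term labels, and the counts are hypothetical.

```python
import math

def rsj_weight(r, R, n, N):
    """Robertson/Sparck Jones relevance weight with 0.5 smoothing (assumed form).

    r: known relevant documents containing the term
    R: known relevant documents
    n: documents in the collection containing the term
    N: documents in the collection
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))

def expansion_candidates(term_stats, R, N, top_k):
    """Rank candidate terms by offer weight r * rsj_weight and keep the top_k.

    term_stats maps a term to a pair (r, n) as defined in rsj_weight.
    """
    scored = [(r * rsj_weight(r, R, n, N), term)
              for term, (r, n) in term_stats.items()]
    scored.sort(reverse=True)
    return [term for _, term in scored[:top_k]]

def number_to_view(ranked_relevance, wanted):
    """Rank at which the wanted-th relevant document appears, or None.

    ranked_relevance is a list of booleans in ranked order, True for relevant.
    """
    found = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            found += 1
            if found == wanted:
                return rank
    return None

# Hypothetical counts: 1000 documents, 5 judged relevant so far.
stats = {"term_a": (3, 10), "term_b": (1, 500), "term_c": (0, 5)}
print(expansion_candidates(stats, R=5, N=1000, top_k=2))
print(number_to_view([False, True, False, True, True], wanted=2))
```

With no relevance information (r = 0, R = 0) the weight reduces to an inverse-document-frequency-like value, so the same function can serve both the initial search and each later feedback iteration. In an incremental setting, r and R would be updated after each judged document and the query rescored.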