Paper information - Deriving Very Short Queries for High Precision and Recall (MultiText Experiments for TREC-7)

The main aim of the MultiText experiments for TREC-7 was to derive very short queries that would yield high precision and recall, using a hybrid of manual and automatic processes. Identical queries were formulated for adhoc and VLC runs. A query set derived automatically from the topic title words, with an average of 2.84 terms per query, achieved a reasonable but unexceptional average precision for the adhoc task and a median precision @20 for the 100 GB VLC task. However, these short queries achieved very fast retrieval times: less than 1 second per query over 100 GB using four inexpensive PCs. Two further query sets were derived using post-processing of the results of interactive searching on the adhoc corpus. Queries comprising a single conjunction, averaging 1.86 terms, achieved high precision on both adhoc and VLC tasks, and achieved faster retrieval times than the title-word queries. Compound queries averaging 6.42 terms achieved precision values competitive with the best runs, and retrieval times of 1.51 seconds per query on the 100 GB VLC corpus.

Charles L. A. Clarke | Gordon V. Cormack | Christopher R. Palmer | Michael Van Biesbrouck

[1] Charles L. A. Clarke, et al. Interactive Substring Retrieval (MultiText Experiments for TREC-5), 1996, TREC.

[2] Charles L. A. Clarke, et al. Relevance ranking for one to three term queries, 1997, Inf. Process. Manag.

[3] Charles L. A. Clarke, et al. Efficient Construction of Large Test, 1998.

[4] Charles L. A. Clarke, et al. Shortest Substring Ranking (MultiText Experiments for TREC-4), 1995, TREC.