Deriving Very Short Queries for High Precision and Recall (MultiText Experiments for TREC-7)
The main aim of the MultiText experiments for TREC-7 was to derive very short queries that would yield high precision and recall, using a hybrid of manual and automatic processes. Identical queries were formulated for adhoc and VLC runs. A query set derived automatically from the topic title words, with an average of 2.84 terms per query, achieved a reasonable but unexceptional average precision for the adhoc task and a median precision@20 for the 100 GB VLC task. However, these short queries achieved very fast retrieval times: less than 1 second per query over 100 GB using four inexpensive PCs. Two further query sets were derived using post-processing of the results of interactive searching on the adhoc corpus. Queries comprising a single conjunction, averaging 1.86 terms, achieved high precision on both adhoc and VLC tasks, and achieved faster retrieval times than the title-word queries. Compound queries averaging 6.42 terms achieved precision values competitive with the best runs, and retrieval times of 1.51 seconds per query on the 100 GB VLC corpus.