The MultiText system retrieves passages, rather than entire documents, that are likely to be relevant to a particular topic. For all runs, we used the reciprocal of each passage's length as an estimate of its likely relevance and ranked accordingly. For the manual adhoc task, we explored the limits of user interaction by judging some 13,000 documents on the basis of retrieved passages. For the automatic adhoc task, we used retrieved passages as a feedback source for new query terms. For the routing task, we estimated the probability of relevance from passage length and used this estimate to construct a compound (tiered) query, which we then applied to the new data, again ranking by passage length. For the Chinese track, we indexed individual characters rather than segmented words or bigrams, and used manually constructed queries with passage-length ranking. For the high precision track, we judged passages using an interface similar to the one used for the manual adhoc task. The Very Large Collection run was performed on a network of four cheap computers using very simple manually constructed queries and passage-length ranking.
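As an illustration of the passage-length ranking described above, the following minimal Python sketch scores each retrieved passage by the reciprocal of its length and sorts accordingly. The `Passage` type, its token-offset fields, and the step that collapses passages to a document ranking by taking each document's best-scoring passage are assumptions made for the example; they are not taken from the MultiText implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """Hypothetical passage record: a token span inside a document."""
    doc_id: str
    start: int  # offset of the first token in the passage
    end: int    # offset of the last token (inclusive)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def score(passage: Passage) -> float:
    # Reciprocal of passage length: shorter passages score higher.
    return 1.0 / passage.length


def rank_passages(passages):
    # Highest score (shortest passage) first; ties keep their input order.
    return sorted(passages, key=score, reverse=True)


def rank_documents(passages):
    # Assumption for illustration: a document inherits the score of its
    # best (shortest) passage.
    best = {}
    for p in passages:
        s = score(p)
        if s > best.get(p.doc_id, 0.0):
            best[p.doc_id] = s
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


passages = [
    Passage("d1", 10, 29),    # 20 tokens
    Passage("d2", 5, 8),      # 4 tokens
    Passage("d1", 100, 101),  # 2 tokens
]
ranked = rank_passages(passages)
docs = rank_documents(passages)
```

Here the two-token passage from `d1` ranks first, so `d1` precedes `d2` in the document ranking even though its other passage is the longest of the three.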