Query expansion with terms selected using lexical cohesion analysis of documents

We present new methods of query expansion (QE) using terms that form lexical cohesive links between the contexts of distinct query terms in documents, where a context is the set of words surrounding a query term in text. The link-forming terms (link-terms) and the short snippets of text surrounding them are evaluated in both interactive and automatic QE. We explore the effectiveness of snippets in providing context in interactive QE, compare QE from snippets with QE from whole documents, and compare QE following snippet selection with QE following full-document relevance judgements. The evaluation, conducted on the HARD track data of TREC 2005, suggests that there are considerable advantages in using link-terms and their surrounding short text snippets in QE compared to terms selected from the full text of documents.
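The idea of a link-term can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed-size word window as the "context" of a query term, treats simple word repetition across the contexts of two distinct query terms as a cohesive link, and does not distinguish overlapping windows. The names `tokenize`, `context_windows`, `link_terms`, and the `window` and `stopwords` parameters are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def context_windows(tokens, term, window=3):
    """Return one set of neighbouring words per occurrence of `term`."""
    contexts = []
    for i, tok in enumerate(tokens):
        if tok == term:
            lo = max(0, i - window)
            neighbours = tokens[lo:i] + tokens[i + 1:i + 1 + window]
            contexts.append(set(neighbours))
    return contexts


def link_terms(text, query_terms, window=3, stopwords=frozenset()):
    """Count candidate link-terms in one document.

    Assumption: a word is a link-term if it occurs in the context of at
    least two distinct query terms. The count is the number of query-term
    pairs whose contexts share that word.
    """
    tokens = tokenize(text)
    query = {q.lower() for q in query_terms}

    # Pool all context words for each query term, excluding the query
    # terms themselves and any stopwords.
    pooled = {}
    for q in query:
        words = set()
        for ctx in context_windows(tokens, q, window):
            words |= ctx
        pooled[q] = words - query - set(stopwords)

    # A word shared by the contexts of two distinct query terms forms a link.
    counts = Counter()
    ordered = sorted(query)
    for idx, a in enumerate(ordered):
        for b in ordered[idx + 1:]:
            for word in pooled[a] & pooled[b]:
                counts[word] += 1
    return counts


if __name__ == "__main__":
    doc = ("Solar panels cut energy bills. "
           "Cheaper energy storage helps solar adoption.")
    print(link_terms(doc, ["solar", "storage"]))
```

In a QE setting, counts like these could be aggregated over the top-ranked documents to rank candidate expansion terms; the ranking function itself is left open here because the abstract does not specify it.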
