How One Word Can Make all the Difference - Using Subject Metadata for Automatic Query Expansion and Reformulation

We analyze query enhancement with domain-specific metadata (thesaurus terms) for monolingual and bilingual retrieval on the GIRT social science collection. We describe our technique of Entry Vocabulary Modules, which associate query words with thesaurus terms, and suggest its use for monolingual as well as bilingual retrieval. We also describe different weighting and merging schemes for adding keywords to queries, as well as translation techniques. Query enhancement generally improves average precision scores for both monolingual and bilingual retrieval. We then take a closer look at individual queries and discuss how query enhancements (or substitutions in bilingual retrieval) can change retrieval results quite dramatically. This query-by-query analysis provides deeper insight into the strengths and weaknesses of each strategy and serves as a cautionary reminder that average precision scores do not always tell the whole story.
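The general idea of associating query words with thesaurus terms and adding the suggested terms to the query with a reduced weight can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy records, the co-occurrence score P(term | word), the function names, and the `term_weight` value are hypothetical and are not the association measure or the weighting and merging schemes used in the paper.

```python
from collections import Counter, defaultdict

# Hypothetical toy training records: (title words, assigned thesaurus terms).
RECORDS = [
    (["youth", "unemployment", "germany"], ["Unemployment", "Adolescent"]),
    (["unemployment", "benefits", "reform"], ["Unemployment", "Social Policy"]),
    (["youth", "crime", "cities"], ["Adolescent", "Criminality"]),
]


def build_associations(records):
    """Count how often each word co-occurs with each thesaurus term."""
    cooc = defaultdict(Counter)
    word_freq = Counter()
    for words, terms in records:
        for w in set(words):
            word_freq[w] += 1
            for t in set(terms):
                cooc[w][t] += 1
    return cooc, word_freq


def suggest_terms(query_words, cooc, word_freq, top_k=2):
    """Rank thesaurus terms by P(term | word) summed over the query words.

    This simple score is an assumed stand-in for a real association measure.
    """
    scores = Counter()
    for w in query_words:
        for t, n in cooc.get(w, {}).items():
            scores[t] += n / word_freq[w]
    return [t for t, _ in scores.most_common(top_k)]


def expand_query(query_words, cooc, word_freq, top_k=2, term_weight=0.5):
    """Return (token, weight) pairs: original words at 1.0, added terms down-weighted."""
    expanded = [(w, 1.0) for w in query_words]
    for t in suggest_terms(query_words, cooc, word_freq, top_k):
        expanded.append((t, term_weight))
    return expanded


cooc, word_freq = build_associations(RECORDS)
expanded = expand_query(["youth", "unemployment"], cooc, word_freq)
```

Down-weighting the added terms is one way to limit the damage when a single suggested term is off-topic, which is the kind of per-query effect that an average precision score can hide.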
