Paper information - Discovering key concepts in verbose queries

Current search engines do not, in general, perform well with longer, more verbose queries. One of the main issues in processing these queries is identifying the key concepts that will have the most impact on effectiveness. In this paper, we develop and evaluate a technique that uses query-dependent, corpus-dependent, and corpus-independent features for automatic extraction of key concepts from verbose queries. We show that our method achieves higher accuracy in the identification of key concepts than standard weighting methods such as inverse document frequency. Finally, we propose a probabilistic model for integrating the weighted key concepts identified by our method into a query, and demonstrate that this integration significantly improves retrieval effectiveness for a large set of natural language description queries derived from TREC topics on several newswire and web collections.

W. Bruce Croft | Michael Bendersky
