Automatic boolean query suggestion for professional search

In professional search environments, such as patent search or legal search, search tasks have unique characteristics: 1) users interactively issue several queries for a topic, and 2) users are willing to examine many retrieval results, i.e., there is typically an emphasis on recall. Recent surveys have also verified that professional searchers continue to have a strong preference for Boolean queries because they provide a record of what documents were searched. To support this type of professional search, we propose a novel Boolean query suggestion technique. Specifically, we generate Boolean queries by exploiting decision trees learned from pseudo-labeled documents and rank the suggested queries using query quality predictors. We evaluate our algorithm in simulated patent and medical search environments. Compared with a recent effective query generation system, we demonstrate that our technique is effective and general.
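The core idea in the abstract, turning a decision tree learned from pseudo-labeled documents into Boolean queries, can be illustrated with a small sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the abstract does not say which tree learner, features, or pseudo-labeling scheme is used, so this sketch uses term presence as the only feature, information gain as the split criterion, and a hand-made toy collection in which the labels stand in for "top-ranked documents from an initial retrieval". The function names (`entropy`, `best_term`, `build_tree`, `boolean_queries`) are hypothetical. Ranking the suggestions with query quality predictors is not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of boolean labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_term(docs, labels):
    """Term whose presence/absence split has the highest information gain."""
    base = entropy(labels)
    best, best_gain = None, 0.0
    for term in sorted(set().union(*docs)):
        yes = [l for d, l in zip(docs, labels) if term in d]
        no = [l for d, l in zip(docs, labels) if term not in d]
        remainder = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
        gain = base - remainder
        if gain > best_gain + 1e-12:  # ties keep the alphabetically first term
            best, best_gain = term, gain
    return best

def build_tree(docs, labels, max_depth=3):
    """Return a leaf (bool) or a node (term, subtree_if_present, subtree_if_absent)."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    term = best_term(docs, labels)
    if term is None:  # no split improves purity
        return majority
    has = [(d, l) for d, l in zip(docs, labels) if term in d]
    lacks = [(d, l) for d, l in zip(docs, labels) if term not in d]
    return (term,
            build_tree(*zip(*has), max_depth - 1),
            build_tree(*zip(*lacks), max_depth - 1))

def boolean_queries(tree, path=()):
    """Each root-to-positive-leaf path becomes one conjunctive Boolean query."""
    if isinstance(tree, bool):
        return [" AND ".join(path)] if tree and path else []
    term, has, lacks = tree
    return (boolean_queries(has, path + (term,)) +
            boolean_queries(lacks, path + ("NOT " + term,)))

# Toy collection (assumed): each document is a set of terms.
docs = [
    {"lithium", "battery", "anode"},
    {"lithium", "battery", "cathode"},
    {"lead", "acid", "battery"},
    {"nickel", "battery"},
    {"solar", "cell", "panel"},
    {"lithium", "mining", "ore"},
]
# Pseudo-labels (assumed): True marks documents treated as relevant.
labels = [True, True, False, False, False, False]

tree = build_tree(docs, labels)
queries = boolean_queries(tree)
print(queries)
```

On this toy data the tree splits first on `lithium` and then on `battery`, so the single suggested query is the conjunction of the two terms. A real system would produce many candidate queries from larger trees, which is where a ranking step becomes necessary.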

W. Bruce Croft | Youngho Kim | Jangwon Seo
