A Query Substitution-Search Result Refinement Approach for Long Query Web Searches

Long queries are widely used in current Web applications such as literature search and news search. However, because long queries are usually expressed as natural-language text rather than as keywords, current keyword-based search engines such as Google perform worse on long queries than on short ones. This paper proposes a query substitution and search result refinement approach for long-query Web searches. First, we retrieve several short queries related to the long query from the users' query history. Then, we cluster the short queries and select the most representative ones to substitute for the original long query. However, searching with the substituted short queries may ignore contexts and terms of the original long query, and can therefore return diverse results that include neighboring, only loosely related information. We therefore compare the contexts of the search results with the context of the original long query and filter out the non-relevant results. Experiments show that our approach achieves high precision for long-query Web searches.
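The pipeline outlined in the abstract (retrieve related short queries, cluster them, pick representatives, then filter results against the long query's context) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the similarity measure, the clustering algorithm, or any thresholds, so the Jaccard overlap on word sets, the greedy single-pass clustering, and every function name and parameter below are hypothetical stand-ins.

```python
import re

def tokens(text):
    """Lowercased word set; an assumed stand-in for the paper's term extraction."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def related_short_queries(long_query, history, min_shared=2):
    """Step 1: keep history queries sharing at least min_shared terms with the long query."""
    lq = tokens(long_query)
    return [q for q in history if len(tokens(q) & lq) >= min_shared]

def cluster(queries, threshold=0.3):
    """Step 2: greedy single-pass clustering; a query joins the first
    cluster whose seed (first member) is similar enough, else starts a new one."""
    clusters = []
    for q in queries:
        for members in clusters:
            if jaccard(tokens(q), tokens(members[0])) >= threshold:
                members.append(q)
                break
        else:
            clusters.append([q])
    return clusters

def representative(members, long_query):
    """Step 3: choose the cluster member closest to the original long query."""
    lq = tokens(long_query)
    return max(members, key=lambda q: jaccard(tokens(q), lq))

def filter_results(long_query, snippets, threshold=0.15):
    """Step 4: drop result snippets whose terms overlap too little with the long query."""
    lq = tokens(long_query)
    return [s for s in snippets if jaccard(tokens(s), lq) >= threshold]
```

A caller would chain the four steps: pass the long query and a query log to `related_short_queries`, feed the output to `cluster`, call `representative` on each cluster to obtain the substitute queries, and run `filter_results` on whatever the search engine returns for them. The thresholds are arbitrary placeholders that would need tuning on real query logs.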
