An approach to use query-related web context on document ranking

As Web search engines have developed, presenting retrieved documents in a suitable order has become an important task. Many search engines use document ranking algorithms to present retrieved documents to users more efficiently. However, even a good algorithm is limited if it does not consider the characteristics of queries, which vary with user intention or interest. Even when users issue the same query, they may want different results depending on when they submit it to the search engine. How can a search engine judge which way of presenting retrieved results is more efficient? To answer this question, we suggest a simple and novel approach that employs query-related Web context. Using the distribution of query-related tweets and news articles, we classify a query as either a hot query or a cold query. We then extract major topic terms from the hot time slice if the query is classified as hot, or extract refined contents if it is classified as cold. Finally, all retrieved results are re-ranked to reflect these topic terms or refined contents, according to the characteristic of the query. To show the value of our approach, we compare our re-ranked results with the original results retrieved from a commercial search engine.
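The pipeline described above (classify the query, extract topic terms, re-rank) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the classification rule, the term-extraction method, or the scoring function, so the peak-to-mean threshold `burst_ratio`, the plain term-frequency extraction, and the term-overlap score below are all assumptions chosen for illustration. The cold-query branch is omitted because the abstract does not say how refined contents are built.

```python
from collections import Counter
from statistics import mean


def classify_query(counts_per_slice, burst_ratio=3.0):
    """Label a query "hot" if one time slice has a burst of related posts.

    counts_per_slice: number of query-related tweets/news items per time slice.
    burst_ratio: hypothetical threshold (peak must be >= burst_ratio * mean).
    Returns ("hot", index_of_peak_slice) or ("cold", None).
    """
    if not counts_per_slice:
        return "cold", None
    avg = mean(counts_per_slice)
    peak = max(counts_per_slice)
    if avg > 0 and peak >= burst_ratio * avg:
        return "hot", counts_per_slice.index(peak)
    return "cold", None


def topic_terms(texts, query, k=5):
    """Pick the k most frequent non-query words from the hot slice's texts.

    Plain term frequency is an assumed stand-in for the paper's extraction.
    """
    query_words = set(query.lower().split())
    counts = Counter(
        word
        for text in texts
        for word in text.lower().split()
        if word not in query_words and len(word) > 3  # crude short-word filter
    )
    return [word for word, _ in counts.most_common(k)]


def rerank(results, terms):
    """Move results sharing more topic terms to the top.

    sorted() is stable, so results with equal overlap keep the
    search engine's original order.
    """
    def overlap(doc):
        words = set(doc.lower().split())
        return sum(1 for term in terms if term in words)

    return sorted(results, key=overlap, reverse=True)
```

For a query classified as hot, `topic_terms` would be called on the tweets and news articles from the returned peak slice, and its output passed to `rerank` together with the search engine's original result list.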
