Query Expansion Powered by Wikipedia Hyperlinks

This research introduces a query expansion method that uses Wikipedia and its hyperlink structure to find related terms for reformulating a query. A query is first split into query aspects. We then measure how well each aspect is represented in the original search results. Poorly represented aspects prove to be an excellent source of query improvement. Our main contribution is the use of Wikipedia to identify aspects, to detect underrepresented aspects, and to weight the expansion terms. Results show that our approach improves on the original query and its search results, and outperforms two existing query expansion methods.
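The pipeline described above (split the query into aspects, score each aspect against the original results, expand the weak ones) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The substring-match representation score, the 0.5 threshold, the `1 - score` weighting, and the `linked_terms` mapping (standing in for terms reached through Wikipedia hyperlinks) are all assumptions made for illustration.

```python
from typing import Dict, List


def aspect_representation(aspects: List[str], results: List[str]) -> Dict[str, float]:
    """Fraction of result snippets that mention each aspect.

    Assumption: a case-insensitive substring match stands in for the
    paper's (unspecified here) representation measure.
    """
    scores = {}
    for aspect in aspects:
        hits = sum(1 for text in results if aspect.lower() in text.lower())
        scores[aspect] = hits / len(results) if results else 0.0
    return scores


def underrepresented(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Aspects whose representation falls below a hypothetical threshold."""
    return [aspect for aspect, score in scores.items() if score < threshold]


def expansion_terms(
    aspects: List[str],
    results: List[str],
    linked_terms: Dict[str, List[str]],
    threshold: float = 0.5,
    per_aspect: int = 2,
) -> Dict[str, float]:
    """Pick expansion terms for weak aspects and weight them.

    linked_terms is a hypothetical stand-in for terms reached through
    Wikipedia hyperlinks from the article matching each aspect.
    Assumption: a term's weight is 1 - representation of its aspect,
    so the weakest aspects contribute the strongest terms.
    """
    scores = aspect_representation(aspects, results)
    weighted: Dict[str, float] = {}
    for aspect in underrepresented(scores, threshold):
        for term in linked_terms.get(aspect, [])[:per_aspect]:
            weight = 1.0 - scores[aspect]
            weighted[term] = max(weighted.get(term, 0.0), weight)
    return weighted


if __name__ == "__main__":
    aspects = ["jaguar", "speed"]
    results = [
        "The jaguar is a large cat",
        "Jaguar habitat in South America",
        "Jaguar cars for sale",
        "jaguar top speed",
    ]
    links = {"speed": ["velocity", "acceleration", "sprint"]}
    print(expansion_terms(aspects, results, links))
```

In this toy example the aspect "jaguar" appears in every snippet, so only "speed" is treated as underrepresented and only its linked terms are added to the query.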
