Discovering Search Engine Related Queries Using Association Rules

This work presents a method for the online generation of related-query suggestions for a Web search engine. The method uses association rules to extract related queries from the log of queries submitted to the search engine. Experiments were performed on a real log containing more than 2.3 million queries submitted to a commercial search engine. For the top 5 related terms, our method presented correct suggestions 90.5% of the time. For queries randomly selected from a log, 93.45% of the suggestions were correct. A study of user behavior showed that in 92.23% of the clicks on suggestions, users found useful information. The same approach can be used to provide terms for the classic problem of query expansion. For instance, using our approach as a query expansion method improved the average precision of the answers of the Google search engine by 23.16%.
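As a rough illustration of how association rules can yield related queries from a log, the sketch below treats each user session as a transaction and derives rules of the form "query A implies query B" from pairwise co-occurrence. The abstract does not specify how transactions are built or which thresholds are used, so the session grouping, the function names `mine_related_queries` and `suggest`, the `min_support` and `min_confidence` parameters, and the sample data are all assumptions made for this example, not the authors' implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_related_queries(sessions, min_support=2, min_confidence=0.0):
    """Mine pairwise association rules from query sessions.

    Assumption: each session (a list of query strings) is one transaction.
    Returns {query: [(related_query, confidence), ...]}, best rule first.
    """
    query_count = Counter()
    pair_count = Counter()
    for session in sessions:
        queries = sorted(set(session))  # a query counts once per session
        query_count.update(queries)
        pair_count.update(combinations(queries, 2))

    rules = defaultdict(list)
    for (a, b), support in pair_count.items():
        if support < min_support:
            continue  # drop pairs that co-occur too rarely
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / query_count[lhs]
            if confidence >= min_confidence:
                rules[lhs].append((rhs, confidence))

    for lhs in rules:
        # Highest confidence first; ties broken alphabetically.
        rules[lhs].sort(key=lambda item: (-item[1], item[0]))
    return dict(rules)


def suggest(rules, query, k=5):
    """Return up to k related queries for `query`."""
    return [related for related, _ in rules.get(query, [])[:k]]


# Hypothetical toy log: four sessions of submitted queries.
sessions = [
    ["cheap flights", "airline tickets"],
    ["cheap flights", "airline tickets", "hotel deals"],
    ["cheap flights", "hotel deals"],
    ["airline tickets", "car rental"],
]
rules = mine_related_queries(sessions, min_support=2)
print(suggest(rules, "cheap flights"))
```

In this toy log, "car rental" receives no suggestions because its only co-occurring pair falls below the support threshold. A production system would also need session segmentation and query normalization, neither of which is covered here.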
