Association Rule Centric Clustering of Web Search Results

Ambiguous queries return an overabundance of results, which calls for soft computing strategies. Search results clustering addresses this problem. This paper presents a novel approach to web search results clustering based on association rules, using the Snowball technique. Association rule mining is applied to terms extracted from the title and snippet of each search result. The detailed algorithm and experimental results on data sets of ambiguous queries are presented.
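As a rough illustration of mining association rules over terms taken from titles and snippets, the following minimal sketch treats each search result as a transaction of terms and derives pairwise rules by support and confidence. It is not the paper's algorithm: the function names (`terms`, `mine_rules`), the stopword list, the thresholds, and the sample results for the ambiguous query "jaguar" are all assumptions made for this example, and no stemming step is included.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to", "is"}


def terms(result):
    """Return the set of lowercase terms in a result's title and snippet."""
    text = f"{result['title']} {result['snippet']}".lower()
    return {t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS}


def mine_rules(results, min_support=0.4, min_confidence=0.8):
    """Mine rules lhs -> rhs between single terms.

    support    = fraction of results containing both terms
    confidence = count(lhs and rhs) / count(lhs)
    """
    transactions = [terms(r) for r in results]
    n = len(transactions)
    item_count = Counter(t for tx in transactions for t in tx)
    pair_count = Counter(
        pair for tx in transactions for pair in combinations(sorted(tx), 2)
    )
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Made-up results for the ambiguous query "jaguar".
results = [
    {"title": "Jaguar car dealer", "snippet": "Luxury car models"},
    {"title": "Jaguar cat facts", "snippet": "Big cat habitat"},
    {"title": "Jaguar car review", "snippet": "New car engine"},
    {"title": "Jaguar cat photos", "snippet": "Wild cat pictures"},
]

rules = mine_rules(results, min_support=0.5, min_confidence=1.0)
```

With these sample results, the surviving rules point from the sense-bearing terms "car" and "cat" to the query term, which hints at how such rules might separate the senses of an ambiguous query into clusters.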
