Search Result Clustering Based on Query Context

This paper introduces a novel interactive and exploratory approach to clustering-based information retrieval search engines. The presented method lets users change the clustering structure by supplying a free-text clustering context query, which is treated as the criterion for document-to-cluster allocation. Exploration mechanisms are provided by redefining the interaction scenario so that the user can interact with the data at the level of topic discovery or cluster labeling. The idea is realized through a graph structure called the Query-Summarize Graph, which is used both to define the similarity measure between snippets and in the snippet clustering algorithm. Experiments on real-world data show that the proposed solution has many interesting properties and can serve as an alternative approach to interactive information retrieval.
