Multi-aspect query summarization by composite query

Conventional search engines usually return a ranked list of web pages in response to a query, and users have to visit several pages to locate the relevant parts. A promising future search scenario should involve (1) understanding user intents and (2) providing relevant information directly to satisfy searchers' needs, as opposed to relevant pages. In this paper, we present a search paradigm that summarizes a query's information from different aspects, where query aspects can be aligned to user intents. The generated summaries for query aspects are expected to be both specific and informative, so that users can find relevant information easily and quickly. Specifically, we use a "Composite Query for Summarization" method, in which a set of component queries is used to provide additional information for the original query. The system leverages the search engine to proactively gather information by submitting multiple component queries formed from the original query and its aspects. In this way, we can obtain more relevant information for each query aspect and roughly classify that information. By comparatively mining the search results of different component queries, the system is able to identify query (dependent) aspect words, which help to generate more specific and informative summaries. The experimental results on two data sets, Wikipedia and TREC ClueWeb2009, are encouraging: our method outperforms two baseline methods at generating informative summaries.
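The composite-query idea can be illustrated with a small sketch. The abstract does not specify how component queries are built or how the comparative mining is scored, so everything here is an assumption made for illustration: the function names `component_queries` and `aspect_words`, the simple "query + aspect" concatenation, the toy result snippets, and the smoothed log-ratio score that favors words concentrated in one aspect's results relative to the others.

```python
from collections import Counter
import math
import re


def component_queries(query, aspects):
    """Form one component query per aspect (assumed: plain concatenation)."""
    return {aspect: f"{query} {aspect}" for aspect in aspects}


def tokenize(text):
    """Lowercase alphabetic tokens only; a deliberately crude tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def aspect_words(results_by_aspect, top_k=3):
    """Rank words per aspect by a smoothed log-ratio of their frequency in
    that aspect's results versus the results of all other aspects.
    This scoring is a stand-in, not the paper's mining method."""
    counts = {
        aspect: Counter(w for doc in docs for w in tokenize(doc))
        for aspect, docs in results_by_aspect.items()
    }
    vocab_size = len(set().union(*counts.values()))
    ranked = {}
    for aspect, own in counts.items():
        rest = Counter()
        for other, c in counts.items():
            if other != aspect:
                rest.update(c)
        own_total = sum(own.values())
        rest_total = sum(rest.values())

        def score(word):
            # Add-one smoothing keeps unseen words from dividing by zero.
            p_own = (own[word] + 1) / (own_total + vocab_size)
            p_rest = (rest[word] + 1) / (rest_total + vocab_size)
            return math.log(p_own / p_rest)

        # Ties are broken alphabetically so the ranking is deterministic.
        ranked[aspect] = sorted(own, key=lambda w: (-score(w), w))[:top_k]
    return ranked


# Toy stand-ins for search results returned by each component query.
queries = component_queries("seattle", ["history", "weather"])
toy_results = {
    "history": ["the city was founded in 1850 by settlers",
                "founded as a trading post"],
    "weather": ["the city has rainy winters and mild summers",
                "rainy climate with mild summers"],
}
words = aspect_words(toy_results)
```

In this sketch, words shared by every aspect's results (such as "city") score low, while words concentrated in one aspect's results rise to the top of that aspect's list. A real system would submit the component queries to a search engine and mine the returned pages rather than hand-written snippets.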
