Paper information - CWS: a comparative web search system

CWS: a comparative web search system

In this paper, we define and study a novel search problem: Comparative Web Search (CWS). The task of CWS is to find relevant and comparative information on the Web to help users compare a set of topics. We developed a system, also called CWS, to serve Web users' comparison needs. Given a set of queries representing the topics that a user wants to compare, the system provides: (1) automatic retrieval and ranking of Web pages based on both their relevance to the queries and the comparative content they contain; (2) automatic clustering of the comparative content into semantically meaningful themes; (3) extraction of representative keyphrases that summarize the commonalities and differences of the comparative content in each theme. We also developed a novel interface that supports two view modes: a pair-view, which displays the results at the page level, and a cluster-view, which organizes the comparative pages into themes and displays the extracted phrases to facilitate comparison. Experimental results show that the CWS system is effective and efficient.
