CQIG: An Improved Web Search Results Clustering Algorithm

The long, flat result lists returned by traditional search engines make it inconvenient for users to extract the information they need. Search result clustering addresses this need by grouping documents on similar topics. Existing algorithms have drawbacks in cluster label screening, cluster quality assessment, and control of overlapping clusters. CQIG, an improved clustering algorithm based on LINGO, improves the scoring functions for clusters and cluster labels, adds a cluster merging step, and improves the handling of Chinese text. Finally, a recommendation platform for Web search result clustering is built on the Carrot2 framework to demonstrate the accuracy, distinctness, and readability of the clusters CQIG produces.
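
The abstract does not say how CQIG decides which clusters to merge, so the following is only a minimal sketch of one common way to control overlapping clusters, not the authors' method. It assumes each cluster is a hypothetical `(label, score, doc_ids)` tuple, measures overlap with the overlap coefficient (shared documents divided by the size of the smaller cluster), and uses an arbitrary threshold of 0.5. The function name `merge_overlapping_clusters` and all parameters are illustrative.

```python
def merge_overlapping_clusters(clusters, threshold=0.5):
    """Greedily merge clusters whose document sets overlap heavily.

    clusters:  list of (label, score, doc_ids) tuples (assumed layout).
    threshold: minimum overlap coefficient that triggers a merge
               (assumed value, not taken from the paper).
    """
    # Visit clusters from highest to lowest score so that the label of the
    # best-scoring cluster survives a merge.
    ordered = sorted(clusters, key=lambda c: c[1], reverse=True)
    merged = []
    for label, score, doc_ids in ordered:
        docs = set(doc_ids)
        for kept in merged:
            smaller = min(len(docs), len(kept["docs"]))
            # Overlap coefficient: |A & B| / min(|A|, |B|).
            overlap = len(docs & kept["docs"]) / smaller if smaller else 0.0
            if overlap >= threshold:
                # Absorb the lower-scoring cluster into the kept one.
                kept["docs"] |= docs
                break
        else:
            # No sufficiently similar cluster exists, so keep this one.
            merged.append({"label": label, "score": score, "docs": docs})
    return merged
```

With this sketch, two clusters labeled "jaguar car" and "jaguar cars" that share most of their documents would collapse into one cluster under the higher-scoring label, while an unrelated "jaguar animal" cluster would stay separate.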
