Implementation of Chinese and English Clustering Engine Based on Improved Suffix Tree Algorithm
This paper presents an algorithm based on an improved suffix tree and the idea of interactive clustering. Hierarchical clustering of document titles and summaries is implemented by improving the traditional suffix tree structure. Meanwhile, interactive clustering is employed instead of the traditional recursive algorithm. The algorithm is language-independent: it applies not only to word-based English but also handles character-based Chinese effectively, without dictionary-based Chinese word segmentation. Furthermore, an interactive clustering engine was built on the basis of the algorithm, the system was tested in different network environments, and its performance was compared with that of other meta-search engines. The experiment demonstrates that it is feasible to conduct real-time interactive clustering effectively by using the improved suffix tree algorithm.
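The general idea can be illustrated with a minimal sketch. It is not the paper's implementation: a naive phrase index stands in for the improved suffix tree, and the names `tokenize`, `base_clusters`, and `expand`, as well as the `max_len` and `min_docs` parameters, are hypothetical. The sketch assumes that English is split into words and Chinese into single characters, so no segmentation dictionary is needed, and that a cluster is refined only when it is selected rather than by recursing over the whole hierarchy up front.

```python
import re
from collections import defaultdict

# Assumption: each CJK ideograph is one token; runs of Latin letters/digits are word tokens.
TOKEN_RE = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def tokenize(text):
    """Split text into lowercase English words and single Chinese characters."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def base_clusters(snippets, max_len=4, min_docs=2):
    """Map each phrase (tuple of up to max_len tokens) shared by at least
    min_docs snippets to the set of snippet ids that contain it.

    A suffix tree would yield the same shared phrases more efficiently;
    this naive index is only a stand-in for illustration.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(snippets):
        tokens = tokenize(text)
        for start in range(len(tokens)):
            stop = min(start + max_len, len(tokens))
            for end in range(start + 1, stop + 1):
                index[tuple(tokens[start:end])].add(doc_id)
    return {phrase: ids for phrase, ids in index.items() if len(ids) >= min_docs}


def expand(snippets, doc_ids, max_len=4, min_docs=2):
    """Cluster only the snippets of one selected cluster, on demand."""
    ids = sorted(doc_ids)
    local = base_clusters([snippets[i] for i in ids], max_len, min_docs)
    # Translate positions within the selected subset back to the original ids.
    return {phrase: {ids[j] for j in members} for phrase, members in local.items()}


snippets = [
    "suffix tree clustering",
    "Suffix tree algorithm",
    "后缀树聚类",
    "后缀树算法",
]
clusters = base_clusters(snippets)
```

With these sample snippets, the phrase `("suffix", "tree")` groups the two English titles and `("后", "缀", "树")` groups the two Chinese ones. Calling `expand(snippets, {2, 3})` then refines only the Chinese group, which mirrors the on-demand style of interaction under the assumptions stated above.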