Semantic-Based Hierarchicalization of Suffix Tree Clustering Results

Suffix tree clustering (STC) is a fast, incremental, linear-time clustering algorithm, but its result clusters often have synonymous labels or labels that contain one another. Returning these results to users directly places an added burden on them. To address this problem, this paper presents a method that merges semantically duplicate clusters and arranges label-contained clusters into a hierarchy. Experimental results show that the method effectively removes semantic duplication and clearly hierarchicalizes label-contained clusters, improving the organization of the clustering results. For an STC-based search engine, this provides users with better-organized results and clearer classification.
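As an illustration of the two post-processing steps named above, the following is a minimal sketch and not the paper's implementation. It assumes that each cluster is a (label, document-id set) pair, that semantic duplication can be detected with a small synonym table (the hypothetical SYNONYMS dictionary), and that a label "contains" another when the shorter label appears in it as a contiguous run of words. The function names and the rule of attaching a cluster to the longest label it contains are likewise assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical synonym table; the semantic resource actually used is not given here.
SYNONYMS = {"automobile": "car", "auto": "car"}


def canonical(label):
    """Lower-case a label and map each word to its canonical synonym."""
    return tuple(SYNONYMS.get(word, word) for word in label.lower().split())


def merge_duplicates(clusters):
    """Merge clusters whose labels are identical after synonym normalization."""
    merged = defaultdict(set)
    for label, docs in clusters:
        merged[canonical(label)] |= set(docs)
    return dict(merged)


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word run strictly inside `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def build_hierarchy(merged):
    """Map each label to its parent: the longest label it contains (assumed rule)."""
    parent = {}
    for label in merged:
        candidates = [other for other in merged if contains(label, other)]
        parent[label] = max(candidates, key=len) if candidates else None
    return parent


# Toy input: (label, document ids) pairs as STC might return them.
clusters = [
    ("car", {1, 2}),
    ("automobile", {2, 3}),
    ("car rental", {2}),
    ("auto rental service", {3}),
]
merged = merge_duplicates(clusters)
hierarchy = build_hierarchy(merged)
```

In this toy input, "car" and "automobile" collapse into a single cluster, while "car rental" and "auto rental service" become successively deeper levels beneath it. A real system would replace the synonym table with a proper semantic resource and might compare document overlap as well as labels.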