Feature quantification method for variable-granularity text clustering
The invention provides a feature quantification method for variable-granularity text clustering, implemented in three steps. First, concept expansion of document keywords: the keyword set of a document is expanded, by means of a knowledge network, into a concept word set with greater semantic coverage. Second, feature representation and similarity calculation: the similarity between two words is understood as the degree of overlap of their common features, and the similarity between documents to be clustered is likewise judged by the number of features they share. Third, variable-granularity clustering is achieved by combining this feature quantification technique with a specific clustering algorithm. The method overcomes a defect of existing document clustering systems, in which unsuitable feature quantification leads to poor results when clustering at variable granularity.
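The first two steps can be illustrated with a minimal sketch. It is not the patented implementation, and everything named in it is an assumption made for illustration: the toy dictionary `TOY_NETWORK` stands in for the knowledge network, the `levels` parameter stands in for the granularity setting, and the Jaccard ratio is one plausible reading of "degree of overlap of common features".

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy knowledge network (an assumption, not from the source):
# each word maps to its broader concepts, nearest concept first.
TOY_NETWORK: Dict[str, List[str]] = {
    "apple": ["fruit", "food"],
    "banana": ["fruit", "food"],
    "bread": ["baked_goods", "food"],
    "car": ["vehicle", "machine"],
}


def expand_keywords(
    keywords: Iterable[str], network: Dict[str, List[str]], levels: int
) -> Set[str]:
    """Expand a keyword set into a concept word set.

    `levels` is a hypothetical granularity knob: 0 keeps only the original
    keywords, larger values add progressively broader concepts.
    """
    expanded: Set[str] = set()
    for word in keywords:
        expanded.add(word)
        # Take the `levels` nearest broader concepts; unknown words add none.
        expanded.update(network.get(word, [])[:levels])
    return expanded


def overlap_similarity(a: Set[str], b: Set[str]) -> float:
    """Jaccard ratio of shared features (an assumed overlap measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def document_similarity(
    doc_a: Iterable[str],
    doc_b: Iterable[str],
    network: Dict[str, List[str]],
    levels: int,
) -> float:
    """Similarity of two documents after concept expansion of their keywords."""
    return overlap_similarity(
        expand_keywords(doc_a, network, levels),
        expand_keywords(doc_b, network, levels),
    )


if __name__ == "__main__":
    for levels in (0, 1, 2):
        sim = document_similarity(["apple"], ["banana"], TOY_NETWORK, levels)
        print(f"levels={levels}: similarity={sim:.3f}")
```

In this sketch, raising `levels` makes documents about "apple" and "banana" share more concept features, so a clustering algorithm driven by this similarity would tend to merge them at coarser granularity and keep them apart at finer granularity.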