Chinese Keywords Clustering Based on SOM

Keyword clustering is useful for text information retrieval, text document classification, and similar tasks. This paper introduces an unsupervised method for clustering Chinese keywords with a self-organizing map (SOM), a type of artificial neural network. Keywords are encoded as numeric vectors according to the similarities between their contextual word sets, each of which consists of the words neighboring the keyword within phrases. The experimental results show that words are clustered on the map according to both their syntactic and their semantic features.
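
The pipeline summarized above (contextual word sets, similarity-based encoding, SOM training) can be illustrated with a minimal sketch. It is not the paper's implementation, and several details are assumptions made only for illustration: the contextual word set is taken to be every other word in the same phrase, set similarity is measured with the Jaccard coefficient, and the SOM uses a small rectangular grid with a Gaussian neighborhood and linearly decaying learning rate. All function names, parameters, and the toy corpus (ASCII tokens standing in for segmented Chinese words) are hypothetical.

```python
import math
import random


def context_sets(phrases, keywords):
    """For each keyword, collect the other words sharing a phrase with it (assumed context)."""
    ctx = {k: set() for k in keywords}
    for phrase in phrases:
        for i, word in enumerate(phrase):
            if word in ctx:
                ctx[word].update(phrase[:i] + phrase[i + 1:])
    return ctx


def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed choice of similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def encode(ctx, keywords):
    """Encode each keyword as its vector of context-set similarities to all keywords."""
    return {k: [jaccard(ctx[k], ctx[j]) for j in keywords] for k in keywords}


def bmu(weights, vec):
    """Return the grid position of the best-matching unit (smallest squared distance)."""
    return min(
        weights,
        key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], vec)),
    )


def train_som(vectors, rows=3, cols=3, epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; all hyperparameters are illustrative defaults."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for t in range(epochs):
        frac = t / epochs
        lr = lr0 * (1 - frac)                  # learning rate decays linearly
        sigma = sigma0 * (1 - frac) + 0.1      # neighborhood radius shrinks
        for vec in rng.sample(vectors, len(vectors)):
            br, bc = bmu(weights, vec)
            for (r, c), w in weights.items():
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (vec[i] - w[i])
    return weights


# Hypothetical toy corpus: each inner list is one pre-segmented phrase.
phrases = [
    ["eat", "apple"], ["eat", "pear"], ["fresh", "apple"], ["fresh", "pear"],
    ["run", "fast"], ["walk", "fast"], ["run", "home"], ["walk", "home"],
]
keywords = ["apple", "pear", "run", "walk"]

vectors = encode(context_sets(phrases, keywords), keywords)
som = train_som(list(vectors.values()))
placement = {k: bmu(som, vectors[k]) for k in keywords}
print(placement)
```

In this toy corpus "apple" and "pear" have identical contextual word sets, as do "run" and "walk", so each pair receives the same vector and therefore the same map unit. On real text the vectors differ by degree, and related keywords would be expected to fall on nearby units instead of a single one.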