WordNet Based Concept Weight using Semantic Relation for Clustering Documents

This paper presents a novel technique that combines conventional clustering techniques with information extracted from WordNet. Traditional clustering algorithms take one of two approaches to document clustering. The first treats each document as a bag of words and considers every term independent, ignoring the semantic relationships between words. The second captures semantics using WordNet. The proposed technique follows the second approach, using several semantic relations (identity, synonym, direct hypernym, and meronym) that are weighted in the order identity > synonym > direct hypernym > meronym. Concepts are weighted by generating chains of related concepts. The technique uses WordNet to create a low-dimensional vector space, which makes an efficient clustering technique possible. The proposed technique can improve cluster quality and achieves a lower-dimensional vector space than other techniques.
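
The relation-weighted concept scoring described above can be illustrated with a minimal sketch. It is not the authors' implementation: the numeric weights (1.0, 0.8, 0.5, 0.3) are assumptions that only respect the stated ordering identity > synonym > direct hypernym > meronym, `TOY_LEXICON` is a hypothetical hand-built stand-in for WordNet, and the function names `concept_weights` and `top_concepts` are invented for illustration. Keeping only the highest-weighted concepts is one plausible way to obtain a low-dimensional vector space, and is likewise an assumption.

```python
from collections import defaultdict

# Assumed relation weights. The source fixes only the ordering
# identity > synonym > direct hypernym > meronym, not these values.
RELATION_WEIGHTS = {
    "identity": 1.0,
    "synonym": 0.8,
    "hypernym": 0.5,
    "meronym": 0.3,
}

# Hypothetical toy lexicon standing in for WordNet:
# term -> list of (relation, related concept).
TOY_LEXICON = {
    "car": [("synonym", "automobile"),
            ("hypernym", "motor_vehicle"),
            ("meronym", "wheel")],
    "automobile": [("synonym", "car"),
                   ("hypernym", "motor_vehicle")],
    "truck": [("hypernym", "motor_vehicle"),
              ("meronym", "wheel")],
}


def concept_weights(tokens):
    """Accumulate a weight for every concept reachable from the tokens.

    Each token contributes the identity weight to itself and the
    relation-specific weight to each directly related concept.
    """
    weights = defaultdict(float)
    for token in tokens:
        weights[token] += RELATION_WEIGHTS["identity"]
        for relation, concept in TOY_LEXICON.get(token, []):
            weights[concept] += RELATION_WEIGHTS[relation]
    return dict(weights)


def top_concepts(tokens, k):
    """Return the k highest-weighted concepts (ties broken alphabetically)."""
    weights = concept_weights(tokens)
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


if __name__ == "__main__":
    document = ["car", "automobile", "truck"]
    for concept, weight in top_concepts(document, 3):
        print(f"{concept}: {weight:.1f}")
```

In this toy document, "motor_vehicle" never occurs as a term, yet it receives weight from all three tokens through the hypernym relation. A bag-of-words representation would miss that shared concept entirely. The retained concept weights could then serve as document features for any standard clustering algorithm.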