Document clustering groups documents so that each cluster is internally coherent and substantially different from the others. Given the sheer volume of documents now available, clustering faces scalability and accuracy issues, and there is a dearth of tools that cluster such voluminous data efficiently. Conventional models adopt either a fully centralized or a fully distributed approach to document clustering. This paper therefore proposes a novel approach that performs document clustering by modifying the conventional Self-Organizing Map (SOM). The contribution of this work is threefold: first, a distributed approach to pre-processing the documents; second, an adaptive bottom-up approach to document clustering; and third, a neighbourhood model suited to a ring topology. Experiments on real datasets and comparison with the traditional SOM show the efficacy of the proposed approach.
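To make the ring-topology idea concrete, the following is a minimal sketch of a standard one-dimensional SOM whose units are arranged on a ring, so that the neighbourhood wraps around from the last unit to the first. It is not the method proposed in the paper: the function names (`ring_distance`, `best_matching_unit`, `train_ring_som`), the Gaussian neighbourhood, the linear decay schedules, and all default parameter values are assumptions chosen for illustration, and documents are assumed to be already pre-processed into fixed-length numeric vectors.

```python
import math
import random


def ring_distance(i, j, n_units):
    """Shortest hop count between units i and j on a ring of n_units."""
    d = abs(i - j)
    return min(d, n_units - d)


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)),
    )


def train_ring_som(vectors, n_units, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a 1-D SOM with wrap-around (ring) neighbourhood.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    if sigma0 is None:
        sigma0 = max(n_units / 4.0, 1.0)
    # Random initial weights in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(range(len(vectors)))
        rng.shuffle(order)
        for idx in order:
            x = vectors[idx]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood width
            bmu = best_matching_unit(weights, x)
            for u in range(n_units):
                # Gaussian neighbourhood measured along the ring.
                dist = ring_distance(bmu, u, n_units)
                h = math.exp(-(dist ** 2) / (2.0 * sigma ** 2))
                w = weights[u]
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

After training, a document vector can be assigned to a cluster by calling `best_matching_unit(weights, x)`. Because distances are measured along the ring, the first and last units are neighbours, which avoids the boundary effects of an open chain of units.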