An adaptive distributed approach of a self organizing map model for document clustering using ring topology

Document clustering aims to group documents so that each group is internally coherent and differs substantially from the other groups. Because of the sheer volume of documents available, clustering faces scalability and accuracy issues. Moreover, there is a dearth of tools that cluster such voluminous data efficiently. Conventional models adopt either a fully centralized or a fully distributed approach to document clustering. This paper therefore proposes a novel approach to document clustering that modifies the conventional Self Organizing Map (SOM). The contribution of this work is threefold: first, a distributed approach to pre-processing the documents; second, an adaptive bottom-up approach to document clustering; and third, a neighbourhood model suited to a ring topology for document clustering. Experiments on real datasets and a comparison with the traditional SOM show the efficacy of the proposed approach.
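
To make the idea of a ring-based neighbourhood concrete, the following is a minimal sketch of a generic SOM whose neurons are assumed to sit on a one-dimensional ring, so that neighbourhood distance wraps around. It is not the model proposed in this paper: the abstract does not specify the neighbourhood function, the learning schedule, or whether the ring refers to the neurons or to the distributed nodes. The function names (`ring_distance`, `neighbourhood`, `train_ring_som`), the Gaussian kernel, the linear decay schedules, and the toy term-frequency vectors are all assumptions made for illustration.

```python
import math
import random


def ring_distance(i, j, n):
    """Shortest hop count between neurons i and j on a ring of n neurons."""
    d = abs(i - j) % n
    return min(d, n - d)


def neighbourhood(i, bmu, n, sigma):
    """Gaussian weight based on ring distance (the kernel is an assumption)."""
    d = ring_distance(i, bmu, n)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)),
    )


def train_ring_som(data, n_neurons=6, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 1-D SOM with wrap-around neighbourhood (hypothetical schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i in range(n_neurons):
                h = neighbourhood(i, bmu, n_neurons, sigma)
                # Pull neuron i towards x, scaled by its ring proximity to the BMU.
                weights[i] = [w + lr * h * (v - w) for w, v in zip(weights[i], x)]
    return weights


# Toy normalised term-frequency vectors (made up for illustration only).
docs = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
]
weights = train_ring_som(docs)
clusters = [best_matching_unit(weights, d) for d in docs]
```

Unlike a linear chain of neurons, the ring has no end points, so the first and last neurons are direct neighbours and every neuron has the same number of neighbours at each distance.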