Kohonen Self Organizing Map with Modified K-means clustering For High Dimensional Data Set

Since it was first proposed, it is amazing to notice how KMeans algorithm has survive over the years. It has been one among the well known algorithms for data clustering in the field of data mining. Day in and day out new algorithms are evolving for data clustering purposes but none can be as fast and accurate as the K-Means algorithm. But in spite of its huge speed, accuracy and simplicity K-Means has suffered from some of its own problem. Such as, the exact number of cluster is not known prior to clustering. The other thing that is causing problem is that it is quite sensitive to initial centroids. Not just that, K-Means fails to give optimum result when it comes to clustering high dimensional data set because its complexity tends to make things more complicated when more number of dimensions are added. In Data Mining this problem is known as “Curse of High Dimensionality”. Here in our paper we proposed a new Modified K-Means algorithm that will overcome the problem faced by the standard KMeans algorithm. We proposed the use of Kohonen Self Organizing Map (KSOM) so as to visualize exact number of clusters before clustering and genetic algorithm is applied for initialization. The Kohonen Self Organizing Map (KSOM) with Modified K-Means algorithm is tested on an iris data set and its performance is compared with other clustering algorithm and is found out to be more accurate, with less number of classification and quantization errors and can be applied even for high dimensional dataset.

[1]  Mohd. Noor Md. Sap,et al.  Hybrid self organizing map for overlapping clusters , 2008 .

[2]  H. S Behera,et al.  AN IMPROVED HYBRIDIZED K- MEANS CLUSTERING ALGORITHM (IHKMCA) FOR HIGHDIMENSIONAL DATASET & IT’S PERFORMANCE ANALYSIS , 2011 .

[3]  Juha Vesanto,et al.  SOM-based data visualization methods , 1999, Intell. Data Anal..

[4]  Chen Zhang,et al.  K-means Clustering Algorithm with Improved Initial Center , 2009, 2009 Second International Workshop on Knowledge Discovery and Data Mining.

[5]  Esa Alhoniemi,et al.  Clustering of the self-organizing map , 2000, IEEE Trans. Neural Networks Learn. Syst..