Paper Information - AN IMPROVED HYBRIDIZED K-MEANS CLUSTERING ALGORITHM (IHKMCA) FOR HIGHDIMENSIONAL DATASET & IT’S PERFORMANCE ANALYSIS

The rapid growth of data in everyday applications brings a corresponding increase in the number of features and attributes per data set, and hence in its dimensionality. As the dimensionality grows, the problem known as the ‘curse of dimensionality’ arises. Mining a high-dimensional data set therefore requires an efficient dimension reduction technique. Numerous methods have been proposed, and many experimental analyses have been carried out, to find a technique that reduces the dimension of a high-dimensional data set without distorting the original data. In this paper we propose the use of canonical variate analysis to reduce the dimensions of a high-dimensional data set efficiently and effectively. A modified k-means clustering technique is then applied to the reduced, low-dimensional data set. To initialize the centroids of the Improved Hybridized K-Means Clustering Algorithm (IHKMCA), we use a genetic algorithm so as to obtain a more accurate result. The results obtained from the proposed work show better accuracy, greater efficiency, and lower time complexity than other approaches.
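The clustering stage described in the abstract (k-means whose initial centroids are chosen by a genetic algorithm) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the chromosome encoding, fitness function, or genetic operators, so everything below is an assumption made for illustration. Here a chromosome is a list of k data-point indices, the cost is the sum of squared errors (SSE), and the operators are truncation selection, single-point crossover, and random mutation. The canonical variate analysis step is omitted; the input is assumed to be already reduced to low dimension. All function names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def ga_initial_centroids(points, k, pop_size=20, generations=30,
                         mutation_rate=0.2, seed=0):
    """Pick k initial centroids with a simple genetic algorithm.

    Assumed encoding: a chromosome is a list of k indices into `points`;
    lower SSE means a fitter chromosome.
    """
    rng = random.Random(seed)
    n = len(points)

    def cost(chrom):
        return sse(points, [points[i] for i in chrom])

    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged.
        population.sort(key=cost)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]          # single-point crossover
            if rng.random() < mutation_rate:   # mutate one gene
                child[rng.randrange(k)] = rng.randrange(n)
            children.append(child)
        population = survivors + children
    best = min(population, key=cost)
    return [points[i] for i in best]


def kmeans(points, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [min(range(len(centroids)),
                      key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    rng = random.Random(1)
    data = ([(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
            + [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)])
    init = ga_initial_centroids(data, k=2)
    centers, labels = kmeans(data, init)
    print(centers)
```

Restricting candidate centroids to actual data points keeps the search space finite and the chromosome easy to mutate; a real-valued encoding would be an equally plausible choice.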

H. S Behera | Rosly Boy Lingdoh | Diptendra Kodamasingh

[1] Chen Zhang, et al. K-means Clustering Algorithm with Improved Initial Center, 2009 Second International Workshop on Knowledge Discovery and Data Mining, 2009.