Variance Based Moving K-Means Algorithm

Clustering is a useful exploratory data analysis method with wide applicability across many fields. However, clustering results depend heavily on the initialization of the cluster centers: a poor initialization can produce large intra-cluster variance and dead centers (centers to which no data elements are assigned), leading to sub-optimal solutions. This paper proposes a novel variance-based version of the conventional Moving K-Means (MKM) algorithm, called Variance Based Moving K-Means (VMKM), that can partition data into optimal homogeneous clusters irrespective of cluster initialization. The algorithm uses a novel distance metric and a unique data element selection criterion to transfer selected elements between clusters, which lowers intra-cluster variance and in turn avoids dead centers. The proposed method is compared quantitatively and qualitatively with various clustering techniques on four datasets drawn from image processing, bioinformatics, remote sensing, and the stock market. An extensive analysis highlights the superior performance of the proposed method over the other techniques.
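
The two failure modes named in the abstract, large intra-cluster variance and dead centers, can be made concrete with a small sketch. The code below is illustrative only and is not the VMKM algorithm: the abstract does not define the proposed distance metric or the selection criterion, so the sketch assumes plain squared Euclidean distance and nearest-center assignment. The function names `assign` and `cluster_report` are hypothetical.

```python
def assign(points, centers):
    """Label each point with the index of its nearest center.

    Assumption: squared Euclidean distance, not the metric proposed in the paper.
    """
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels


def cluster_report(points, centers):
    """Return (variances, dead) for a given set of centers.

    variances[k] is the mean squared distance of cluster k's members to
    their own centroid, or None when the cluster is empty.
    dead lists the indices of centers that received no points.
    """
    labels = assign(points, centers)
    variances, dead = [], []
    for k in range(len(centers)):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            dead.append(k)
            variances.append(None)
            continue
        # Centroid of the members, computed one dimension at a time.
        centroid = [sum(dim) / len(members) for dim in zip(*members)]
        sq_dists = [sum((a - b) ** 2 for a, b in zip(p, centroid)) for p in members]
        variances.append(sum(sq_dists) / len(members))
    return variances, dead


points = [(0, 0), (0, 2), (10, 0), (10, 2)]

# Two centers placed near the two natural groups: low variance, no dead center.
good = cluster_report(points, [(0, 1), (10, 1)])

# One center placed far from all data: it becomes a dead center, and the
# remaining cluster absorbs every point, inflating its variance.
bad = cluster_report(points, [(5, 1), (100, 100)])
```

With these toy points, the poorly initialized run leaves the second center empty and merges both groups into one high-variance cluster, which is the situation the abstract says VMKM addresses by moving elements between clusters.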
