A Novel Clustering Method Combining Heuristics and Information Theorem

Many data mining tasks require the unsupervised partitioning of a data set into clusters. However, in many case we do not really know any prior knowledge about the clusters, for example, the density or the shape. This paper addresses two major issues associated with conventional competitive learning, namely, sensitivity to initialization and difficulty in determining the number of clusters. Many methods exist for such clustering, but most of then have assumed hyper-ellipsoidal clusters. Many heuristically proposed competitive learning methods and its variants, are somewhat ad hoc without any theoretical support. Under above considerations, we propose an algorithm named as Entropy guided Splitting Competitive Learning (ESCL) in the information theorem framework. Simulations show that minimization of partition entropy can be used to guide the competitive learning process, so to estimate the number and structure of probable data generators.