Hierarchical clustering with planar segments as prototypes

Abstract: Clustering methods divide a set of observations into groups such that members of the same group are more similar to one another than to members of other groups. One of the best-known clustering methods is hierarchical agglomerative clustering. Different clustering methods suit data with different properties. If the data are locally linear in form, planar (or hyperplanar) prototypes should be advantageous. A criterion-minimization clustering method using planar prototypes is known, but it has a crucial drawback: the prototypes have infinite extent, so very distant data points can be added to a cluster, and such points can differ considerably from the majority of the cluster. The goal of this work is to overcome this problem by developing a hierarchical agglomerative clustering method whose prototypes are confined to segments of hyperplanes. In the experimental part, we show that for data with locally linear form this method is highly competitive with the switching regression models method (an accuracy improvement of 24%) as well as with other well-known clustering methods (an accuracy improvement of 16%).
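
The drawback of infinite-extent prototypes can be illustrated with a small 2-D sketch. This is not the paper's algorithm; it only contrasts the distance from a point to an infinite line with the distance to a bounded segment of that line. The function names `dist_to_line` and `dist_to_segment` and the sample points are assumptions made for illustration.

```python
import math

def dist_to_line(p, a, b):
    """Distance from point p to the infinite line through a and b (2-D)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    # |cross product| / |direction| is the perpendicular distance.
    return abs(dx * (py - ay) - dy * (px - ax)) / math.hypot(dx, dy)

def dist_to_segment(p, a, b):
    """Distance from point p to the segment with endpoints a and b (2-D)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    # Project p onto the line, then clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

# Hypothetical prototype: the unit segment on the x-axis.
a, b = (0.0, 0.0), (1.0, 0.0)
near = (0.5, 0.1)    # lies next to the segment
far = (100.0, 0.1)   # lies far along the line's extension

# Both points are equally close to the infinite line ...
line_near = dist_to_line(near, a, b)
line_far = dist_to_line(far, a, b)

# ... but only the near point is close to the bounded segment.
seg_near = dist_to_segment(near, a, b)
seg_far = dist_to_segment(far, a, b)
```

With an infinite line as the prototype, the distant point looks just as well fitted as the nearby one; confining the prototype to a segment makes the distant point clearly dissimilar, which is the behavior the abstract motivates.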
