Kernel-based MinMax clustering methods with kernelization of the metric and auto-tuning hyper-parameters

Abstract: This paper proposes kernel-based MinMax clustering methods with kernelization of the metric and automatic tuning of the hyper-parameters, which learn the variable weights and adjust the cluster weights automatically. New objective functions are developed, from which the proposed algorithms are derived; they seek a desirable partition by minimizing dissimilarity measures computed with kernelization of the metric. Two additional steps are introduced into the k-means algorithm, so that performance improves while efficiency is preserved. More specifically, the proposed algorithms learn two types of weights at each iteration: variable weights, which identify the relevant variables, and cluster weights, which limit the occurrence of large-variance clusters. Finally, experiments on ten UCI benchmark datasets corroborate the superiority of the proposed algorithms.
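
The abstract does not give the objective functions or update rules, so the following is only a minimal sketch of how the three ingredients it names could fit together, under stated assumptions. It assumes a per-variable Gaussian kernel for the kernelized metric, a product-to-one constraint for the variable weights, and the cluster-weight rule of standard MinMax k-means with exponent `p`. All function names, the parameter `sigma2`, and the default `p=0.5` are hypothetical and are not taken from the paper.

```python
import numpy as np


def kernelized_dissimilarity(X, centers, var_weights, sigma2):
    """Weighted dissimilarity with a per-variable Gaussian kernel (assumed form).

    For each variable j the squared distance induced by the kernel is
    2 * (1 - exp(-(x_j - v_j)^2 / (2 * sigma2_j))); the per-variable terms
    are then combined using the variable weights.
    X: (n, d), centers: (k, d), var_weights: (d,), sigma2: (d,).
    Returns an (n, k) matrix of dissimilarities.
    """
    diff2 = (X[:, None, :] - centers[None, :, :]) ** 2        # (n, k, d)
    per_var = 2.0 * (1.0 - np.exp(-diff2 / (2.0 * sigma2)))   # (n, k, d)
    return per_var @ var_weights                              # (n, k)


def minmax_cluster_weights(cluster_costs, p=0.5):
    """Cluster weights as in standard MinMax k-means (assumed here), 0 <= p < 1.

    Clusters with a larger within-cluster cost receive a larger weight,
    which penalizes them in the next assignment step.
    """
    powered = np.asarray(cluster_costs, dtype=float) ** (1.0 / (1.0 - p))
    return powered / powered.sum()


def product_constrained_variable_weights(var_costs):
    """Variable weights minimizing sum_j w_j * D_j subject to prod_j w_j = 1.

    Closed form: w_j = geometric_mean(D) / D_j. Assumes every D_j > 0.
    """
    var_costs = np.asarray(var_costs, dtype=float)
    geo_mean = np.exp(np.mean(np.log(var_costs)))
    return geo_mean / var_costs


def assign(X, centers, var_weights, cluster_weights, sigma2, p=0.5):
    """Assign each point to the cluster with the smallest weighted dissimilarity."""
    d = kernelized_dissimilarity(X, centers, var_weights, sigma2)
    return np.argmin((cluster_weights ** p) * d, axis=1)


# Tiny usage example on two well-separated groups of 2-D points.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
labels = assign(X, centers,
                var_weights=np.ones(2),
                cluster_weights=np.array([0.5, 0.5]),
                sigma2=np.ones(2))
```

In this sketch each per-variable kernel term is bounded by 2, so a single outlying coordinate cannot dominate the dissimilarity the way it can with the squared Euclidean distance. A full iteration would alternate the assignment step with updates of the centers, the variable weights, and the cluster weights; the center update is omitted here because its form depends on the objective, which the abstract does not state.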
