A New Algorithm for Discriminative Clustering and Its Maximum Entropy Extension

Discriminative clustering (DC) effectively integrates subspace selection and clustering into a coherent framework. It iterates between dimensionality reduction by classical Linear Discriminant Analysis (LDA) and clustering. DC can effectively cluster high-dimensional data. However, it has a complex form and high computational complexity. Recent work shows that DC is equivalent to kernel k-means (KM) with a specific kernel matrix. This insight provides a chance to simplify the optimization problem in the original DC algorithm. Based on this equivalence, the Discriminative K-means (DKM) algorithm was proposed. When the number of data points, denoted n, is small, DKM is feasible and efficient. However, constructing the kernel matrix in DKM requires computing the inverse of a matrix, which is time consuming when n is large. In this paper, we concentrate on the efficiency of DC. We present a new framework for DC, namely Efficient DC (EDC), which consists of DKM and the whitening transformation of the regularized total scatter matrix (WRTS) followed by KM clustering (WRTS+KM). When the number of dimensions m is small and n far exceeds m, namely n ≫ m, EDC carries out WRTS+KM on the data, which is more efficient than DKM. When n is small and m far exceeds n, namely m ≫ n, EDC carries out DKM on the data, which is more efficient. We also extend EDC to the soft case and propose Efficient Discriminative Maximum Entropy Clustering (EDMEC), an efficient version of maximum-entropy-based DC. Extensive experiments on a collection of benchmark data sets are presented to show the effectiveness of the proposed algorithms.
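
The abstract does not give formulas, so the following is only a minimal sketch of why WRTS+KM and DKM can be interchangeable. It assumes the DKM kernel has the form G = I - (I + X Xᵀ/λ)⁻¹ on centered data X (rows are points), as in the published Discriminative K-means formulation. Under that assumption, the Woodbury identity gives G = X (S_t + λI)⁻¹ Xᵀ, which is the linear Gram matrix of the data whitened by the regularized total scatter matrix S_t + λI. The function names, the parameter `lam`, and the plain Lloyd-style k-means loop are illustrative choices, not the authors' implementation.

```python
import numpy as np

def whiten_regularized_total_scatter(X, lam=1.0):
    """WRTS sketch: return Z = Xc (S_t + lam*I)^(-1/2), where Xc is centered X.

    Cost is driven by an m x m eigendecomposition, so it suits n >> m.
    """
    Xc = X - X.mean(axis=0)
    St = Xc.T @ Xc                               # m x m total scatter matrix
    evals, evecs = np.linalg.eigh(St + lam * np.eye(St.shape[0]))
    W = (evecs * evals ** -0.5) @ evecs.T        # inverse matrix square root
    return Xc @ W

def dkm_kernel(X, lam=1.0):
    """Assumed DKM kernel: I - (I + Xc Xc^T / lam)^(-1).

    Requires an n x n inverse, so it suits m >> n.
    """
    Xc = X - X.mean(axis=0)
    n = Xc.shape[0]
    return np.eye(n) - np.linalg.inv(np.eye(n) + Xc @ Xc.T / lam)

def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd iterations with random initial centers (illustrative only)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic data with n >> m: 200 points in 5 dimensions, two shifted groups.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:100] += 4.0

Z = whiten_regularized_total_scatter(X, lam=1.0)
labels = kmeans(Z, k=2)

# The Gram matrix of the whitened data matches the assumed DKM kernel.
same_kernel = np.allclose(Z @ Z.T, dkm_kernel(X, lam=1.0))
```

Because ordinary k-means on Z depends on the data only through Z Zᵀ, this sketch clusters with the same kernel as the assumed DKM form while replacing the n × n inverse with an m × m eigendecomposition. That is consistent with the abstract's choice of WRTS+KM when n ≫ m and DKM when m ≫ n.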
