Information cut for clustering using a gradient descent approach

We introduce a new graph cut for clustering which we call the Information Cut. It is derived using Parzen windowing to estimate an information theoretic distance measure between probability density functions. We propose to optimize the Information Cut using a gradient descent-based approach. Our algorithm has several advantages compared to many other graph-based methods in terms of determining an appropriate affinity measure, computational complexity, memory requirements and coping with different data scales. We show that our method may produce clustering and image segmentation results comparable or better than the state-of-the art graph-based methods.

[1]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[2]  Guy L. Scott,et al.  Feature grouping by 'relocalisation' of eigenvectors of the proximity matrix , 1990, BMVC.

[3]  Sergios Theodoridis,et al.  Pattern Recognition, Third Edition , 2006 .

[4]  J.C. Principe,et al.  Information theoretic spectral clustering , 2004, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541).

[5]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[6]  Michael Werman,et al.  Self-Organization in Vision: Stochastic Clustering for Image Segmentation, Perceptual Grouping, and Image Database Organization , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Charles J. Alpert,et al.  Spectral Partitioning: The More Eigenvectors, The Better , 1995, 32nd Design Automation Conference.

[8]  Stephen J. Roberts,et al.  Maximum certainty data partitioning , 2000, Pattern Recognit..

[9]  B. Ripley,et al.  Pattern Recognition , 1968, Nature.

[10]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[11]  Richard M. Leahy,et al.  An Optimal Graph Theoretic Approach to Data Clustering: Theory and Its Application to Image Segmentation , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  E. Parzen On Estimation of a Probability Density Function and Mode , 1962 .

[13]  José Carlos Príncipe,et al.  Information Theoretic Clustering , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  P. J. Green,et al.  Density Estimation for Statistics and Data Analysis , 1987 .

[15]  J. Simonoff Multivariate Density Estimation , 1996 .

[16]  Deniz Erdogmus,et al.  Information Theoretic Learning , 2005, Encyclopedia of Artificial Intelligence.

[17]  Anil K. Jain,et al.  Unsupervised texture segmentation using Gabor filters , 1990, 1990 IEEE International Conference on Systems, Man, and Cybernetics Conference Proceedings.

[18]  Satosi Watanabe,et al.  Pattern Recognition: Human and Mechanical , 1985 .

[19]  Chris H. Q. Ding,et al.  A min-max cut algorithm for graph partitioning and data clustering , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[20]  Naftali Tishby,et al.  Data Clustering by Markovian Relaxation and the Information Bottleneck Method , 2000, NIPS.

[21]  Robert Jenssen,et al.  Information cut and information forces for clustering , 2003, 2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718).

[22]  Joachim M. Buhmann,et al.  Pairwise Data Clustering by Deterministic Annealing , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[23]  Singiresu S. Rao Engineering Optimization : Theory and Practice , 2010 .

[24]  H. L. Le Roy,et al.  Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; Vol. IV , 1969 .

[25]  Larry S. Davis,et al.  Efficient Kernel Density Estimation Using the Fast Gauss Transform with Applications to Color Modeling and Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[26]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[27]  Santosh S. Vempala,et al.  On clusterings-good, bad and spectral , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[28]  N. Deo,et al.  Graph-theoretic algorithms for image segmentation , 1999, ISCAS'99. Proceedings of the 1999 IEEE International Symposium on Circuits and Systems VLSI (Cat. No.99CH36349).

[29]  James C. Bezdek,et al.  A Convergence Theorem for the Fuzzy ISODATA Clustering Algorithms , 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[31]  Geoffrey C. Fox,et al.  Vector quantization by deterministic annealing , 1992, IEEE Trans. Inf. Theory.

[32]  D. Higham,et al.  A Unified View of Spectral Clustering ∗ , 2004 .

[33]  Anna R. Karlin,et al.  Spectral analysis of data , 2001, STOC '01.

[34]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[35]  Robert Jenssen,et al.  Optimizing the Cauchy-Schwarz PDF Distance for Information Theoretic, Non-parametric Clustering , 2005, EMMCVPR.

[36]  Yair Weiss,et al.  Segmentation using eigenvectors: a unifying view , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.