Entropy-Rate Clustering: Cluster Analysis via Maximizing a Submodular Function Subject to a Matroid Constraint

We propose a new objective function for clustering that consists of two components: the entropy rate of a random walk on a graph and a balancing term. The entropy rate favors compact, homogeneous clusters, while the balancing term encourages clusters of similar size and penalizes large clusters that aggressively group samples. We present a novel construction for the graph associated with the data and show that this construction induces a matroid, a combinatorial structure that generalizes the concept of linear independence in vector spaces. The clustering result is given by the graph topology that maximizes the objective function under the matroid constraint. By exploiting the submodularity and monotonicity of the objective function, we develop an efficient greedy algorithm and prove an approximation bound of 1/2 for the optimality of the greedy solution. We validate the proposed algorithm on various benchmarks and show that its performance is competitive with that of popular clustering algorithms. We further apply it to the task of superpixel segmentation. Experiments on the Berkeley segmentation data set show that it outperforms state-of-the-art superpixel segmentation algorithms on all the standard evaluation metrics.
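To make the greedy procedure concrete, the following is a minimal sketch, not the implementation evaluated in the experiments. It rests on several assumptions that the abstract does not spell out: the matroid constraint is taken to be "the selected edges form a forest with at least `k` connected components", the entropy rate is computed for a random walk whose unselected edge weight is redirected to a self-loop, and the balancing term is the entropy of the cluster-size distribution minus the number of clusters. The names `DisjointSet`, `objective`, `greedy_cluster`, and the weight `lam` are hypothetical, and the objective is recomputed from scratch at each step for clarity rather than speed.

```python
import math


class DisjointSet:
    """Union-find structure used to detect cycles and count clusters."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.count = n  # current number of connected components

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.count -= 1
        return True


def objective(n, all_edges, chosen, lam):
    """Assumed objective: entropy rate + lam * balancing term."""
    total_w = [0.0] * n  # weighted degree of each node in the full graph
    for u, v, w in all_edges:
        total_w[u] += w
        total_w[v] += w
    w_sum = sum(total_w)

    # Entropy rate: stationary probability times transition entropy per node.
    kept = [0.0] * n
    rate = 0.0
    for u, v, w in chosen:
        for i in (u, v):
            p = w / total_w[i]
            rate -= (total_w[i] / w_sum) * p * math.log(p)
            kept[i] += w
    for i in range(n):
        p_self = 1.0 - kept[i] / total_w[i]  # weight of unselected edges
        if p_self > 1e-12:
            rate -= (total_w[i] / w_sum) * p_self * math.log(p_self)

    # Balancing term: entropy of cluster sizes minus number of clusters.
    ds = DisjointSet(n)
    for u, v, _ in chosen:
        ds.union(u, v)
    roots = {ds.find(i) for i in range(n)}
    size_entropy = -sum((ds.size[r] / n) * math.log(ds.size[r] / n) for r in roots)
    balance = size_entropy - len(roots)
    return rate + lam * balance


def greedy_cluster(n, edges, k, lam=0.5):
    """Greedily add the edge with the largest gain until k clusters remain."""
    chosen, ds, candidates = [], DisjointSet(n), list(edges)
    while ds.count > k and candidates:
        base = objective(n, edges, chosen, lam)
        best, best_gain, feasible = None, -math.inf, []
        for e in candidates:
            if ds.find(e[0]) == ds.find(e[1]):
                continue  # adding e would close a cycle, so it is infeasible
            feasible.append(e)
            gain = objective(n, edges, chosen + [e], lam) - base
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:
            break
        ds.union(best[0], best[1])
        chosen.append(best)
        candidates = [e for e in feasible if e is not best]
    return [ds.find(i) for i in range(n)]


# Toy graph: two tightly connected triangles joined by one weak edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 0.01)]
labels = greedy_cluster(6, edges, k=2)
```

On this toy graph the weak edge between nodes 2 and 3 is never selected, so the two triangles end up as separate clusters. An efficient version would update marginal gains incrementally instead of re-evaluating the objective for every candidate edge.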
