Bayesian Nonparametric Clustering for Positive Definite Matrices

Symmetric positive definite (SPD) matrices arise as data descriptors in several computer vision applications, such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these matrices is an integral part of these applications, and is generally done with algorithms such as K-Means or expectation maximization. As is well known, these algorithms require the number of clusters to be specified in advance, which becomes difficult as the dataset grows. To address this issue, we resort to the classical nonparametric Bayesian framework and model the data as a mixture model with a Dirichlet process (DP) prior. Because SPD matrices do not conform to Euclidean geometry but instead belong to a curved Riemannian manifold, existing DP models cannot be applied directly. Thus, in this paper, we propose a novel DP mixture (DPM) model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure for comparing these matrices, and exploiting the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that the model scales with dataset size while achieving superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.
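As a rough illustration of the dissimilarity measure named above, the log-determinant (Burg) divergence between two d x d SPD matrices is commonly written D(X, Y) = tr(X Y^{-1}) - log det(X Y^{-1}) - d. The sketch below is only a minimal NumPy rendering of that standard formula, not the paper's implementation; the function names `logdet_divergence` and `random_spd` and the test matrices are assumptions made for the example.

```python
import numpy as np


def logdet_divergence(X, Y):
    """Log-determinant divergence D(X, Y) = tr(X Y^{-1}) - log det(X Y^{-1}) - d.

    Hypothetical helper for illustration; X and Y are assumed to be SPD.
    """
    d = X.shape[0]
    # Solve Y Z = X instead of inverting Y; tr(Y^{-1} X) equals tr(X Y^{-1}).
    Z = np.linalg.solve(Y, X)
    sign_x, logdet_x = np.linalg.slogdet(X)
    sign_y, logdet_y = np.linalg.slogdet(Y)
    if sign_x <= 0 or sign_y <= 0:
        raise ValueError("both inputs must be positive definite")
    return float(np.trace(Z) - (logdet_x - logdet_y) - d)


def random_spd(d, rng):
    """Draw an arbitrary SPD matrix (illustrative data only)."""
    A = rng.standard_normal((d, d))
    return A @ A.T + d * np.eye(d)


rng = np.random.default_rng(0)
X = random_spd(3, rng)
Y = random_spd(3, rng)

d_xx = logdet_divergence(X, X)  # zero up to rounding error
d_xy = logdet_divergence(X, Y)  # non-negative
d_yx = logdet_divergence(Y, X)  # generally differs from d_xy (asymmetric)
```

The same trace and log-determinant terms appear in the log-density of a Wishart distribution, which is presumably the connection the abstract exploits to obtain a conjugate Wishart-Inverse-Wishart model.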
