On the Centroids of Symmetrized Bregman Divergences

In this paper, we generalize the notions of centroids and barycenters to the broad class of information-theoretic distortion measures called Bregman divergences. Bregman divergences are versatile: they unify quadratic geometric distances with various statistical entropic measures. Because Bregman divergences are typically asymmetric, we consider the left-sided centroid, the right-sided centroid, and the symmetrized centroid, and prove that all three are unique. We give closed-form solutions for the sided centroids, which are generalized means. For the symmetrized centroid, we design a provably fast approximation algorithm based on its exact geometric characterization; the algorithm only requires walking along the geodesic that links the two sided centroids. We report on our generic implementation for computing entropic centers of image clusters and entropic centers of multivariate normals, and compare our results with earlier ad hoc methods.
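
As an illustration of the sided centroids and of a walk along the geodesic, the following Python sketch uses the generator F(x) = sum_j (x_j log x_j - x_j), an assumption chosen for this example, whose Bregman divergence is the extended Kullback-Leibler divergence. It also assumes the convention that the right-sided centroid minimizes the average of D_F(p_i || c) and the left-sided centroid minimizes the average of D_F(c || p_i). All function names are hypothetical, and the plain grid search along the geodesic is a simple stand-in, not the approximation algorithm of the paper.

```python
import math

# Assumed generator (illustration only): F(x) = sum_j (x_j log x_j - x_j),
# defined for vectors with strictly positive coordinates.
def grad_F(x):
    return [math.log(v) for v in x]

def grad_F_inv(y):
    return [math.exp(v) for v in y]

def divergence(p, q):
    # D_F(p || q) = F(p) - F(q) - <p - q, grad F(q)>; for this F it is
    # the extended Kullback-Leibler divergence.
    return sum(a * math.log(a / b) - a + b for a, b in zip(p, q))

def mean(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def right_centroid(points):
    # Minimizer of the average D_F(p_i || c): the arithmetic mean.
    return mean(points)

def left_centroid(points):
    # Minimizer of the average D_F(c || p_i): a generalized mean,
    # grad_F_inv applied to the mean of the gradients.
    return grad_F_inv(mean([grad_F(p) for p in points]))

def geodesic_point(c_right, c_left, lam):
    # Point at parameter lam in [0, 1] on the curve that is a straight
    # segment in gradient coordinates (lam=0: c_right, lam=1: c_left).
    g_r, g_l = grad_F(c_right), grad_F(c_left)
    return grad_F_inv([(1 - lam) * a + lam * b for a, b in zip(g_r, g_l)])

def symmetrized_loss(c, points):
    # Average of the symmetrized divergence between c and each point.
    total = sum(0.5 * (divergence(p, c) + divergence(c, p)) for p in points)
    return total / len(points)

def symmetrized_centroid_on_geodesic(points, steps=1000):
    # Stand-in search: evaluate the loss on a uniform grid of the geodesic
    # and keep the best candidate. Not the algorithm of the paper.
    c_r, c_l = right_centroid(points), left_centroid(points)
    candidates = (geodesic_point(c_r, c_l, k / steps) for k in range(steps + 1))
    return min(candidates, key=lambda c: symmetrized_loss(c, points))

points = [[1.0, 4.0], [2.0, 1.0], [4.0, 2.0]]
c_right = right_centroid(points)
c_left = left_centroid(points)
c_sym = symmetrized_centroid_on_geodesic(points)
```

For this generator the left-sided centroid reduces to the coordinate-wise geometric mean, which is one concrete instance of the generalized means mentioned above. The grid search is slow but makes no assumption about the shape of the loss along the geodesic.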
