Quantization and clustering with Bregman divergences

This paper deals with the problem of quantization of a random variable X taking values in a separable and reflexive Banach space, and with the related question of clustering independent random observations distributed as X. To this end, we use a quantization scheme with a class of distortion measures called Bregman divergences, and provide conditions ensuring the existence of an optimal quantizer and an empirically optimal quantizer. Rates of convergence are also discussed.
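As a rough illustration of the quantization scheme described above, the following is a minimal sketch restricted to finite-dimensional vectors, not the Banach-space setting of the paper. It assumes the standard definition of the Bregman divergence generated by a differentiable convex function phi, d_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, and a Lloyd-type alternating procedure in which each center is replaced by the mean of its cell (in finite dimension, the mean minimizes the average Bregman divergence to a center). The function names (`bregman_divergence`, `empirical_quantizer`, `empirical_distortion`) and the two generators are hypothetical choices made for this example only. The procedure searches heuristically for a quantizer with small empirical distortion and is not guaranteed to return an empirically optimal one.

```python
import math
import random

def bregman_divergence(phi, grad_phi, x, y):
    """d_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y> for vectors in R^d."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# Two classical generators (illustrative assumptions, finite dimension only).
def sq_norm(x):
    # Generates the squared Euclidean distance.
    return sum(xi * xi for xi in x)

def sq_norm_grad(x):
    return [2.0 * xi for xi in x]

def neg_entropy(x):
    # Generates the generalized Kullback-Leibler divergence; needs x_i > 0.
    return sum(xi * math.log(xi) for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) + 1.0 for xi in x]

def empirical_quantizer(points, k, phi, grad_phi, n_iter=50, seed=0):
    """Lloyd-type heuristic: alternate nearest-center assignment and cell means."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        # Assignment step: each point joins the cell of its closest center.
        cells = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: bregman_divergence(phi, grad_phi, p, centers[i]))
            cells[j].append(p)
        # Update step: the mean of a cell minimizes its average divergence.
        new_centers = []
        for j, cell in enumerate(cells):
            if cell:
                dim = len(cell[0])
                new_centers.append(
                    [sum(p[c] for p in cell) / len(cell) for c in range(dim)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cell
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def empirical_distortion(points, centers, phi, grad_phi):
    """Average divergence from each observation to its closest center."""
    return sum(min(bregman_divergence(phi, grad_phi, p, c) for c in centers)
               for p in points) / len(points)
```

For example, `empirical_quantizer(points, 2, sq_norm, sq_norm_grad)` reduces to ordinary k-means with two centers, while passing `neg_entropy` and `neg_entropy_grad` clusters strictly positive vectors under the generalized Kullback-Leibler divergence.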
