Improved minimax bounds on the test and training distortion of empirically designed vector quantizers

Earlier results show that the minimax expected (test) distortion redundancy of empirical vector quantizers with three or more levels, designed from n independent and identically distributed (i.i.d.) data points, is at least Ω(1/√n) for the class of distributions on a bounded set. In this correspondence, a much simpler construction and proof of this bound are given, with much better constants. Similar bounds hold for the training distortion of the empirically optimal vector quantizer with three or more levels; these rates, however, do not hold for a one-level quantizer. Here the two-level quantizer case is clarified, showing that it already shares the behavior of the general case. Since the minimax bounds are proved using a construction involving discrete distributions, one might suspect that for the class of distributions with uniformly bounded continuous densities the expected distortion redundancy decreases as o(1/√n) uniformly. It is shown that this is not so: the Ω(1/√n) lower bound for the expected test distortion remains true for these subclasses.
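For concreteness, the quantities discussed in the abstract can be formalized as follows. This is a standard sketch in our own notation (the symbols q, D, D_k^*, q_n, q_n^*, and c_k are ours, not taken from the correspondence), and details such as the exact constants are deliberately left unspecified.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Let $X, X_1, \dots, X_n$ be i.i.d.\ random vectors in $\mathbb{R}^d$ with
common distribution $\mu$ supported on a bounded set. A $k$-level vector
quantizer is a measurable map $q \colon \mathbb{R}^d \to \{y_1,\dots,y_k\}$,
and its (test) distortion and the optimal $k$-level distortion are
\[
  D(\mu, q) = \mathbb{E}\,\lVert X - q(X) \rVert^2,
  \qquad
  D_k^*(\mu) = \inf_{q} D(\mu, q).
\]
An empirical design rule produces a quantizer $q_n = q_n(X_1,\dots,X_n)$
from the sample; its expected test distortion redundancy is
$\mathbb{E}\, D(\mu, q_n) - D_k^*(\mu)$. The minimax lower bound discussed
above states that for every design rule, with $k \ge 3$ in the earlier
results and $k \ge 2$ after the present correspondence,
\[
  \sup_{\mu}\,\bigl( \mathbb{E}\, D(\mu, q_n) - D_k^*(\mu) \bigr)
  \;\ge\; \frac{c_k}{\sqrt{n}}
\]
for some constant $c_k > 0$ and all sufficiently large $n$. The empirically
optimal quantizer $q_n^*$ minimizes the training distortion
$D_n(q) = \tfrac{1}{n} \sum_{i=1}^{n} \lVert X_i - q(X_i) \rVert^2$, which
underestimates the optimum, and the training analogue
$\sup_{\mu}\,\bigl( D_k^*(\mu) - \mathbb{E}\, D_n(q_n^*) \bigr)$ obeys a
similar $1/\sqrt{n}$ lower bound for quantizers with three or more levels.
\end{document}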
