Alpha-Divergence for Classification, Indexing and Retrieval (Revised 2)

Motivated by Chernoff's bound on the asymptotic probability of error, we propose the alpha-divergence measure and a surrogate, the alpha-Jensen difference, for feature classification, indexing and retrieval in image and other databases. The alpha-divergence, also known as the Rényi divergence, is a generalization of the Kullback-Leibler divergence and the Hellinger affinity between the probability density characterizing image features of the query and the density characterizing features of candidates in the database. As in any divergence-based classification problem, the alpha-divergence must be estimated from the query or reference object and the objects in the database. The surrogate for the alpha-divergence, called the alpha-Jensen difference, can be estimated simply, using non-parametric estimation of the joint alpha-entropy of the merged pairs of feature vectors. Two methods of alpha-entropy estimation are investigated: (1) indirect methods based on parametric or non-parametric density estimation over feature space; and (2) direct methods based on combinatorial optimization of minimal spanning trees or other continuous quasi-additive graphs over feature space. We illustrate these results on the estimation of dependency in the plane and on the geo-registration of images.
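To make the quantities named above concrete, the following is a minimal sketch for discrete distributions (histograms), not the estimators studied in the paper. It assumes the standard Rényi forms D_alpha(p||q) = log(sum p^alpha q^(1-alpha)) / (alpha - 1) and H_alpha(p) = log(sum p^alpha) / (1 - alpha) with 0 < alpha < 1, and it takes the alpha-Jensen difference to be the entropy of the mixture minus the mixture of the entropies. The function names and the mixing weight `beta` are illustrative choices, not notation taken from the text.

```python
import math

def _check_alpha(alpha):
    # This sketch only covers the range 0 < alpha < 1.
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

def renyi_divergence(p, q, alpha):
    """Renyi alpha-divergence D_alpha(p || q) of two discrete distributions."""
    _check_alpha(alpha)
    # Terms with p_i == 0 or q_i == 0 contribute nothing when 0 < alpha < 1.
    s = sum(pi ** alpha * qi ** (1.0 - alpha)
            for pi, qi in zip(p, q) if pi > 0 and qi > 0)
    return math.log(s) / (alpha - 1.0)

def renyi_entropy(p, alpha):
    """Renyi alpha-entropy H_alpha(p) of a discrete distribution."""
    _check_alpha(alpha)
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)

def alpha_jensen_difference(p, q, alpha, beta=0.5):
    """Entropy of the beta-mixture minus the beta-mixture of the entropies."""
    mix = [beta * pi + (1.0 - beta) * qi for pi, qi in zip(p, q)]
    return renyi_entropy(mix, alpha) - (
        beta * renyi_entropy(p, alpha) + (1.0 - beta) * renyi_entropy(q, alpha))

def kl_divergence(p, q):
    """Kullback-Leibler divergence; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hellinger_affinity(p, q):
    """Hellinger affinity: the sum of sqrt(p_i * q_i)."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

# Hypothetical feature histograms for a query and a database candidate.
query = [0.5, 0.3, 0.2]
candidate = [0.2, 0.3, 0.5]

d_half = renyi_divergence(query, candidate, 0.5)        # -2 log(Hellinger affinity)
d_near_one = renyi_divergence(query, candidate, 0.999)  # approaches the KL divergence
jensen = alpha_jensen_difference(query, candidate, 0.5)
```

Under these assumptions, alpha = 1/2 gives minus twice the logarithm of the Hellinger affinity, and letting alpha tend to 1 recovers the Kullback-Leibler divergence, which is the sense in which the alpha-divergence generalizes both. The alpha-Jensen difference needs only entropies, so in a sample-based setting it could be computed from any alpha-entropy estimate of the two feature sets and their merged set.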
