Paper information - A similarity measure between unordered vector sets with application to image categorization

A similarity measure between unordered vector sets with application to image categorization

We present a novel approach to compute the similarity between two unordered variable-sized vector sets. To solve this problem, several authors have proposed to model each vector set with a Gaussian mixture model (GMM) and to compute a probabilistic measure of similarity between the GMMs. The main contribution of this paper is to model each vector set with a GMM adapted from a common "universal" GMM using the maximum a posteriori (MAP) criterion. The advantages of this approach are twofold. MAP provides a more accurate estimate of the GMM parameters compared to standard maximum likelihood estimation (MLE) in the challenging case where the cardinality of the vector set is small. Moreover, there is a correspondence between the Gaussians of two GMMs adapted from a common distribution and one can take advantage of this fact to efficiently compute the probabilistic similarity. This work is applied to the image categorization problem: images are modeled as bags of low-level features and classification is performed using a kernel classifier based on the proposed similarity measure. Experimental results on the PASCAL VOC 2006 and VOC 2007 databases show the excellent performance of our approach.
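The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a 1-D GMM, adapts only the means using a relevance factor in the style of reference [7], and scores two adapted models by summing a closed-form Gaussian overlap over corresponding components only. The function names, the `relevance` parameter, and the similarity formula are all assumptions made for illustration.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(samples, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a universal 1-D GMM to `samples`.

    Assumption: weights and variances stay fixed; `relevance` controls how
    strongly the universal means are trusted when few samples are available.
    """
    k = len(means)
    occupancy = [0.0] * k    # soft count of samples per component
    first_order = [0.0] * k  # posterior-weighted sum of samples per component
    for x in samples:
        likes = [w * gauss_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # sample is numerically unexplained by every component
        for i in range(k):
            post = likes[i] / total
            occupancy[i] += post
            first_order[i] += post * x

    adapted = []
    for i in range(k):
        if occupancy[i] > 0.0:
            alpha = occupancy[i] / (occupancy[i] + relevance)
            sample_mean = first_order[i] / occupancy[i]
            adapted.append(alpha * sample_mean + (1.0 - alpha) * means[i])
        else:
            adapted.append(means[i])  # no evidence: keep the universal mean
    return adapted


def componentwise_similarity(means_a, means_b, weights, variances):
    """Overlap of two adapted GMMs using only corresponding components.

    Assumption: for two Gaussians sharing variance v, the integral of their
    product is N(m_a; m_b, 2v); cross terms between different components
    are ignored, so the cost is linear in the number of components.
    """
    return sum(w * w * gauss_pdf(ma, mb, 2.0 * v)
               for ma, mb, w, v in zip(means_a, means_b, weights, variances))


if __name__ == "__main__":
    weights, means, variances = [0.5, 0.5], [0.0, 5.0], [1.0, 1.0]
    set_a = map_adapt_means([1.0] * 10, weights, means, variances)
    set_b = map_adapt_means([4.0] * 10, weights, means, variances)
    print(componentwise_similarity(set_a, set_a, weights, variances))
    print(componentwise_similarity(set_a, set_b, weights, variances))
```

With few samples the adapted means stay close to the universal ones, which is the small-cardinality behaviour the abstract attributes to MAP. Because both models start from the same universal GMM, component i of one model can be compared directly with component i of the other.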

Florent Perronnin | Yan Liu

[1] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[2] Chin-Hui Lee,et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..

[3] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..

[4] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..

[5] Jeff A. Bilmes,et al. A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models , 1998 .

[6] Roland Kuhn,et al. Rapid speaker adaptation in eigenvoice space , 2000, IEEE Trans. Speech Audio Process..

[7] Douglas A. Reynolds,et al. Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..

[8] Mark J. F. Gales. Cluster adaptive training of hidden Markov models , 2000, IEEE Trans. Speech Audio Process..

[9] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[10] Nuno Vasconcelos,et al. A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applications , 2003, NIPS.

[11] Shiri Gordon,et al. An efficient image similarity measure based on approximations of KL-divergence between two gaussian mixtures , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[12] R. Kondor,et al. Bhattacharyya and Expected Likelihood Kernels , 2003 .

[13] David G. Lowe,et al. Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[14] Nuno Vasconcelos,et al. The Kullback-Leibler Kernel as a Framework for Discriminant and Localized Representations for Visual Recognition , 2004, ECCV.

[15] David G. Lowe,et al. Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[16] Gabriela Csurka,et al. Visual categorization with bags of keypoints , 2004, ECCV.

[17] Nuno Vasconcelos,et al. On the efficient evaluation of probabilistic similarity functions for image retrieval , 2004, IEEE Transactions on Information Theory.

[18] Tony Jebara,et al. Probability Product Kernels , 2004, J. Mach. Learn. Res..

[19] S. Lazebnik,et al. Local Features and Kernels for Classification of Texture and Object Categories: An In-Depth Study , 2005 .

[20] Trevor Darrell,et al. The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[21] Cordelia Schmid,et al. The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[22] Jianguo Zhang,et al. The PASCAL Visual Object Classes Challenge , 2006 .

[23] Steve Young,et al. The HTK book version 3.4 , 2006 .

[24] Trevor Darrell,et al. Approximate Correspondences in High Dimensions , 2006, NIPS.

[25] Gabriela Csurka,et al. Adapted Vocabularies for Generic Visual Categorization , 2006, ECCV.