Fuzzy clustering based encoding for Visual Object Classification

The bag-of-visual-words model is a widely used approach to Visual Object Classification (VOC). Two key phases of VOC are the vocabulary building step, i.e. the construction of a "visual dictionary" of codewords that are common in the image corpus, and the assignment step, i.e. the encoding of the images by means of these codewords. Hard assignment of image descriptors to visual codewords is commonly used in both steps. However, because only a single visual word is assigned to each feature descriptor, hard assignment can distort the characterization of an image in terms of its distribution of visual words, which may lead to poor classification. Soft assignment, by contrast, can improve classification results by taking into account the relevance of a feature descriptor to more than one visual word. Fuzzy Set Theory (FST) offers a natural way to accomplish soft assignment, and fuzzy clustering in particular fits well within the VOC framework. In this paper we investigate the effects of using the well-known Fuzzy C-means (FCM) algorithm and its kernelized version (KFCM) to create the visual vocabulary and to perform image encoding. Preliminary results on the Pascal VOC data set show that fuzzy clustering can improve the encoding step of VOC. In particular, KFCM provides better classification results than standard FCM and K-means.
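
The difference between hard and soft assignment in the encoding step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fuzzifier value `m = 2`, the sum-pooling of memberships into a histogram, and the optional Gaussian-kernel distance (one common way to kernelize FCM while keeping the codewords in input space) are all assumptions made for illustration. The membership formula itself is the standard FCM one, in which the membership of a descriptor to codeword j is proportional to d_j^(-2/(m-1)).

```python
import numpy as np

def squared_distances(descriptors, codewords):
    """Squared Euclidean distances, shape (n_descriptors, n_codewords)."""
    diff = descriptors[:, None, :] - codewords[None, :, :]
    return (diff ** 2).sum(axis=2)

def fcm_memberships(descriptors, codewords, m=2.0, kernel_sigma=None, eps=1e-12):
    """Standard FCM membership of each descriptor to each codeword.

    m is the fuzzifier (assumed value, must be > 1). If kernel_sigma is given,
    distances are measured in the feature space of a Gaussian kernel,
    d^2 = 2 * (1 - K(x, v)); this is an assumed, illustrative kernelization.
    """
    d2 = squared_distances(descriptors, codewords)
    if kernel_sigma is not None:
        d2 = 2.0 * (1.0 - np.exp(-d2 / (2.0 * kernel_sigma ** 2)))
    d2 = np.maximum(d2, eps)              # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))        # equals d^(-2/(m-1))
    return inv / inv.sum(axis=1, keepdims=True)

def soft_histogram(descriptors, codewords, m=2.0, kernel_sigma=None):
    """Image encoding: average membership mass received by each codeword."""
    u = fcm_memberships(descriptors, codewords, m, kernel_sigma)
    return u.sum(axis=0) / len(descriptors)

def hard_histogram(descriptors, codewords):
    """Image encoding: each descriptor votes only for its nearest codeword."""
    nearest = squared_distances(descriptors, codewords).argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(codewords))
    return counts / len(descriptors)

if __name__ == "__main__":
    # Toy data: two codewords and one descriptor lying exactly between them.
    codewords = np.array([[0.0, 0.0], [2.0, 0.0]])
    descriptors = np.array([[1.0, 0.0]])
    print("hard:", hard_histogram(descriptors, codewords))
    print("soft:", soft_histogram(descriptors, codewords))
    print("soft (kernel):", soft_histogram(descriptors, codewords, kernel_sigma=1.0))
```

In the toy case the descriptor is equidistant from both codewords, so hard assignment has to credit only one of them, while the fuzzy memberships split the vote evenly. This is the loss of information that the abstract attributes to hard assignment.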
