Semantic Contexts and Fisher Vectors for the ImageCLEF 2011 Photo Annotation Task

This paper describes the participation of UNICAEN/GREYC in the ImageCLEF 2011 photo annotation task. The proposed approach uses only visual image features and binary concept annotations. Annotations are predicted by SVM classifiers trained separately for each concept. The classifiers take Bag-of-Words histograms and Fisher vector representations as inputs, and the two are combined at the decision level. Contextual information is also embedded into the Bag-of-Words histograms to improve their performance. The experimental results show that combining Bag-of-Words histograms and Fisher vectors brings a significant performance increase (e.g. 4% in Mean Average Precision). Our best run ranks in the top 3 for both the concept-level and image-level evaluations.
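
The decision-level combination described above can be illustrated with a minimal sketch. It assumes each channel (Bag-of-Words and Fisher vectors) has already produced one score per image and per concept, and it fuses them with a weighted average; the abstract does not state the actual fusion rule, so the function names, the `weight` parameter, and the non-interpolated average precision used here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def late_fusion(scores_bow, scores_fv, weight=0.5):
    """Fuse two (n_images, n_concepts) score matrices at the decision level.

    `weight` is a hypothetical mixing parameter: 1.0 keeps only the
    Bag-of-Words scores, 0.0 keeps only the Fisher vector scores.
    """
    scores_bow = np.asarray(scores_bow, dtype=float)
    scores_fv = np.asarray(scores_fv, dtype=float)
    return weight * scores_bow + (1.0 - weight) * scores_fv


def average_precision(labels, scores):
    """Non-interpolated average precision for a single concept."""
    labels = np.asarray(labels)
    # Rank images from highest to lowest score.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ranked = labels[order]
    hits = np.cumsum(ranked)
    precision_at_k = hits / np.arange(1, len(ranked) + 1)
    n_pos = ranked.sum()
    if n_pos == 0:
        return 0.0
    # Average the precision values at the ranks of the positive images.
    return float((precision_at_k * ranked).sum() / n_pos)


def mean_average_precision(label_matrix, score_matrix):
    """Mean of the per-concept average precisions (one column per concept)."""
    label_matrix = np.asarray(label_matrix)
    score_matrix = np.asarray(score_matrix)
    aps = [
        average_precision(label_matrix[:, c], score_matrix[:, c])
        for c in range(label_matrix.shape[1])
    ]
    return float(np.mean(aps))
```

In such a setup, the fused score matrix would be passed to `mean_average_precision` alongside the binary ground-truth annotations, and `weight` would typically be chosen on a held-out validation split.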