Joint Equal Contribution of Global and Local Features for Image Annotation

Image annotation is a very important task as the number of photographs has gone sky-high. This paper describes our participation in the ImageCLEF Large Scale Visual Concept Detection and Annotation Task 2009. We present the method used for our best run. Our approach is inspired from a recently proposed method where joint equal contribution (JEC) of simple global color and texture features can outperform the state-of-the-art annotation techniques [10]. Our idea is that if such simple features could do so well, then the combination of higher-level features would do even better. Study has shown that the concurrent use of saliency and gist of the scene is a major trait of human vision system. Therefore, in this preliminary study, we propose to explore the combination of different visual features at global, local and scene levels including global and local color, texture, and gist of the scene. The experiments confirm that higher-level features lead to better performance. Through the experiments, we also found that using 40 nearest neighbors and HSV, HSV (at saliency regions), HAAR, GIST (full scene), GIST (scene at the center) as features produce the best result.We finally identify the weakness in our approach and ways on how the system could be optimized and improved.

[1]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[2]  Masashi Inoue,et al.  Image retrieval: Research and use in the information explosion , 2009 .

[3]  M. Potter Short-term conceptual memory for pictures. , 1976, Journal of experimental psychology. Human learning and memory.

[4]  A. Friedman Framing pictures: the role of knowledge in automatized encoding and memory for gist. , 1979, Journal of experimental psychology. General.

[5]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[6]  Vladimir Pavlovic,et al.  A New Baseline for Image Annotation , 2008, ECCV.

[7]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[8]  Stefanie Nowak,et al.  Overview of the CLEF 2009 Large Scale - Visual Concept Detection and Annotation Task , 2009, CLEF.

[9]  Mark J. Huiskes,et al.  The MIR flickr retrieval evaluation , 2008, MIR '08.

[10]  Liqing Zhang,et al.  Saliency Detection: A Spectral Residual Approach , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.