Efficient match kernel in fine-grained image categorization

In this paper, we study fine-grained image categorization, which is much more useful in real applications than basic-level image classification. On CUB-200, one of the most challenging datasets for this task, we combine the efficient match kernel (EMK) with a weighted spatial pyramid to achieve state-of-the-art performance. Compared with bag-of-words (BoW), which can also be viewed as a kernel matching approach, EMK exploits the relations among the vocabulary bases and derives a new feature mapping within the kernel framework. With this mapping, local features are projected into a low-dimensional feature space, and the resulting vectors are averaged to form a set-level feature. This mapping is shown to improve system performance.
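
The map-then-average step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the Gaussian kernel, the fixed basis vectors `Z`, the closed-form projection `G` (chosen so that `G.T @ G` approximates the inverse of the basis kernel matrix), and all function names are assumptions made for illustration. The spatial pyramid weighting and any learning of the basis vectors are omitted.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    # Pairwise Gaussian kernel k(a, b) = exp(-gamma * ||a - b||^2).
    # The kernel choice is an assumption made for this sketch.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_projection(Z, gamma=1.0, ridge=1e-8):
    # Z: (D, d) basis vectors, e.g. a visual vocabulary (hypothetical input).
    # Returns a symmetric G with G.T @ G ~= inv(K_ZZ), so that
    # phi(x) = G @ k_Z(x) projects x onto the span of the basis vectors
    # in kernel space. The small ridge only stabilizes the inversion.
    K = gaussian_kernel(Z, Z, gamma)
    w, V = np.linalg.eigh(K + ridge * np.eye(len(Z)))
    return (V / np.sqrt(w)) @ V.T

def emk_set_feature(X, Z, G, gamma=1.0):
    # X: (n, d) local descriptors of one image.
    # Map each descriptor to D dimensions, then average over the set.
    phi = gaussian_kernel(X, Z, gamma) @ G.T
    return phi.mean(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(16, 8))        # 16 basis vectors in 8 dimensions
    G = fit_projection(Z, gamma=0.05)
    img_a = rng.normal(size=(100, 8))   # stand-ins for local descriptors
    img_b = rng.normal(size=(120, 8))
    f_a = emk_set_feature(img_a, Z, G, gamma=0.05)
    f_b = emk_set_feature(img_b, Z, G, gamma=0.05)
    # A linear kernel on set-level features approximates the average
    # pairwise kernel between the two descriptor sets.
    print(f_a.shape, float(f_a @ f_b))
```

Under these assumptions, the dot product of two set-level features approximates the mean of the pairwise kernel values between the two descriptor sets, so a linear classifier can be trained on the averaged vectors without evaluating the kernel for every pair of local features.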
