Improving object classification using semantic attributes

This paper shows how semantic attribute features can be used to improve object classification performance. The semantic attributes used fall into five groups: scene (e.g. ‘road’), colour (e.g. ‘green’), part (e.g. ‘face’), shape (e.g. ‘box’), and material (e.g. ‘wood’). For each of 60 semantic attributes, we train a classifier from representative images. We first assess the accuracy of the individual classifiers, and show that they can be used to predict semantic annotations for test images. We then use the outputs of the trained classifiers to create a new low-dimensional image representation. Experiments on data from the PASCAL VOC challenge show that the semantic attribute features achieve an object classification performance close to that of high-dimensional bag-of-words features, and that combining semantic attribute features with bag-of-words features gives better classification performance than using either feature set alone.
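As a rough illustration of the representation described above, the following minimal sketch turns the outputs of 60 attribute classifiers into a 60-dimensional feature vector and combines it with a bag-of-words histogram. Everything beyond the count of 60 attributes is an assumption made for illustration: the abstract does not say what input features the attribute classifiers use, what form they take, or how the two feature sets are combined. Here the classifiers are hypothetical linear scorers with random weights standing in for trained models, they take the bag-of-words histogram as input, and the combination is plain concatenation.

```python
import math
import random

NUM_ATTRIBUTES = 60   # number of semantic attributes, from the abstract
BOW_DIM = 1000        # assumed visual vocabulary size, not from the paper


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attribute_features(bow, classifiers):
    """Return one score in (0, 1) per attribute classifier."""
    return [
        sigmoid(sum(w * x for w, x in zip(weights, bow)) + bias)
        for weights, bias in classifiers
    ]


def combined_features(bow, classifiers):
    """Concatenate the bag-of-words histogram with the attribute scores."""
    return list(bow) + attribute_features(bow, classifiers)


rng = random.Random(0)

# Hypothetical stand-ins for trained attribute classifiers:
# each is a (weights, bias) pair with random values.
classifiers = [
    ([rng.gauss(0, 0.1) for _ in range(BOW_DIM)], rng.gauss(0, 0.1))
    for _ in range(NUM_ATTRIBUTES)
]

# Hypothetical stand-in for one image's normalised bag-of-words histogram.
counts = [rng.random() for _ in range(BOW_DIM)]
total = sum(counts)
bow = [c / total for c in counts]

attrs = attribute_features(bow, classifiers)
combo = combined_features(bow, classifiers)
print(len(attrs), len(combo))  # 60 1060
```

In this sketch the attribute vector alone is the low-dimensional representation, and the concatenated vector is one simple way to use both feature sets together; either could then be passed to an object classifier.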
