SUN attribute database: Discovering, annotating, and recognizing scene attributes

In this paper we present the first large-scale scene attribute database. First, we perform crowd-sourced human studies to find a taxonomy of 102 discriminative attributes. Next, we build the “SUN attribute database” on top of the diverse SUN categorical database. Our attribute database spans more than 700 categories and 14,000 images and has potential for use in high-level scene understanding and fine-grained scene recognition. We use our dataset to train attribute classifiers and evaluate how well these relatively simple classifiers can recognize a variety of attributes related to materials, surface properties, lighting, functions and affordances, and spatial envelope properties.
