Enhancing Semantic Features with Compositional Analysis for Scene Recognition

Scene recognition systems are generally based on features that represent image semantics by modeling the content depicted in an image. In this paper, we propose a framework for scene recognition that goes beyond visual content analysis by exploiting a new cue for categorization: image composition, namely the photographic style and layout of the image. We capture image composition by storing the values of affective, aesthetic, and artistic features in a compositional vector. We verify the discriminative power of this compositional vector for scene categorization by using it to classify images from diverse, large-scale scene understanding datasets. We then combine the compositional features with traditional semantic features in a complete scene recognition framework. Results show that, because compositional and semantic features are complementary, scene categorization systems benefit from incorporating descriptors that represent the photographic layout of the image (+13-15% over semantic-only categorization).
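The abstract does not state how the compositional and semantic features are combined. As a purely illustrative sketch, one common option is early fusion: normalize each descriptor separately and concatenate them before classification. The function names and toy vectors below are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; an all-zero vector is returned as zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def fuse_features(semantic, compositional):
    """Early fusion (an assumption, not the paper's stated method):
    normalize each descriptor on its own so neither dominates by scale,
    then concatenate them into a single vector for a classifier."""
    return l2_normalize(semantic) + l2_normalize(compositional)


# Toy descriptors standing in for real semantic and compositional features.
semantic_vec = [3.0, 4.0]
compositional_vec = [0.0, 2.0, 0.0]
fused = fuse_features(semantic_vec, compositional_vec)
```

Normalizing each descriptor before concatenation keeps a high-dimensional or large-valued semantic descriptor from swamping the compositional vector. Late fusion of per-feature classifier scores would be an equally plausible reading of the abstract.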
