Object coding on the semantic graph for scene classification

In the scene classification, a scene can be considered as a set of object cliques. Objects inside each clique have semantic correlations with each other, while two objects from different cliques are relatively independent. To utilize these correlations for better recognition performance, we propose a new method - Object Coding on the Semantic Graph to address the scene classification problem. We first exploit prior knowledge by making statistics on a large number of labeled images and calculating the dependency degree between objects. Then, a graph is built to model the semantic correlations between objects. This semantic graph captures semantics by treating the objects as vertices and the objects affinities as the weights of edges. By encoding this semantic knowledge into the semantic graph, object coding is conducted to automatically select a set of object cliques that have strongly semantic correlations to represent a specific scene. The experimental results show that the Object Coding on semantic graph can improve the classification accuracy.

[1]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[2]  Fei-Fei Li,et al.  What, where and who? Classifying events by scene and object recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[3]  Yueting Zhuang,et al.  Sparse Unsupervised Dimensionality Reduction for Multiple View Data , 2012, IEEE Transactions on Circuits and Systems for Video Technology.

[4]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, CVPR.

[5]  Cordelia Schmid,et al.  Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[6]  Qi Tian,et al.  Correlated attribute transfer with multi-task graph-guided fusion , 2012, ACM Multimedia.

[7]  Julien Mairal,et al.  Supervised feature selection in graphs with path coding penalties and network flows , 2012, J. Mach. Learn. Res..

[8]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[9]  Qi Tian,et al.  Image Annotation by Input–Output Structural Grouping Sparsity , 2012, IEEE Transactions on Image Processing.

[10]  Yi Yang,et al.  Harmonizing Hierarchical Manifolds for Multimedia Document Semantics Understanding and Cross-Media Retrieval , 2008, IEEE Transactions on Multimedia.

[11]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[12]  Dawei Song,et al.  Pure High-Order Word Dependence Mining via Information Geometry , 2011, ICTIR.

[13]  Hao Su,et al.  Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification , 2010, NIPS.

[14]  Hao Su,et al.  Objects as Attributes for Scene Classification , 2010, ECCV Workshops.

[15]  Mubarak Shah,et al.  Recognizing Complex Events Using Large Margin Joint Low-Level Event Model , 2012, ECCV.