Paper Information - Context-Based Scene Understanding

Context-Based Scene Understanding

Context plays an important role in object detection performance. Two considerations dominate the design of context models for computer vision applications: the type of context (semantic, spatial, scale) and the scope of the relations (pairwise, high-order). This paper presents a new unified framework that combines multiple sources of context in high-order relations to encode the semantic coherence and consistency of scenes. The framework introduces a new descriptor, the context relevance score, to model the context-based distribution of the response variables, and applies it to two distributions. The first model incorporates the context descriptor, along with the annotation response, into a supervised Latent Dirichlet Allocation (LDA) built on a multivariate Bernoulli distribution, called Context-Based LDA (CBLDA). The second model is based on the multivariate Wallenius' noncentral hypergeometric distribution and is called Wallenius LDA (WLDA). WLDA incorporates context knowledge as a bias parameter. Scene context is modeled as a graph and used in the object detection framework to maximize the semantic consistency of the scene. The graph can also be used to recognize out-of-context objects. Annotation metadata from the SUN397 dataset is used to construct the context model. The performance of the proposed approaches was evaluated on the ImageNet dataset. A comparison between the proposed approaches and a state-of-the-art multi-class object annotation algorithm shows that the presented approach is superior in labeling scene content.
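To make the idea of context acting as a bias parameter more concrete, the following is a minimal sketch of the urn scheme that underlies Wallenius' noncentral hypergeometric distribution: items are drawn one at a time without replacement, and each category is picked with probability proportional to its weight times its remaining count. This is not the paper's WLDA model. The function name `wallenius_draw`, the object labels, the counts, and the weights are all hypothetical, chosen only to show how a larger weight (standing in for stronger contextual relevance) shifts which labels get drawn.

```python
import random
from collections import Counter


def wallenius_draw(counts, weights, n, rng):
    """Draw up to n items one at a time without replacement.

    At every step a category is chosen with probability proportional to
    weight * remaining count. This sequential biased sampling is the urn
    scheme behind Wallenius' noncentral hypergeometric distribution.
    """
    remaining = dict(counts)
    drawn = Counter()
    for _ in range(n):
        labels = [k for k, c in remaining.items() if c > 0]
        if not labels:  # urn is empty
            break
        probs = [weights[k] * remaining[k] for k in labels]
        pick = rng.choices(labels, weights=probs, k=1)[0]
        remaining[pick] -= 1
        drawn[pick] += 1
    return drawn


# Hypothetical street scene: equal counts per label, but the weights
# (an assumed stand-in for context bias) favor in-context labels.
counts = {"car": 5, "road": 5, "toaster": 5}
weights = {"car": 3.0, "road": 3.0, "toaster": 0.2}

rng = random.Random(0)
trials = 2000
totals = Counter()
for _ in range(trials):
    totals.update(wallenius_draw(counts, weights, 4, rng))

# Average number of times each label appears among 4 draws.
mean = {k: totals[k] / trials for k in counts}
print(mean)
```

With equal weights the scheme reduces to ordinary sampling without replacement (the central multivariate hypergeometric case), so any departure from that baseline in the averages is due to the bias weights alone.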

Borko Furht | Esfandiar Zolghadr
