Scene Understanding Using Context-based Conditional Random Field

In this paper, a new framework for scene understanding using a multi-modal, high-order context model is introduced. Spatial and semantic interactions are treated as sources of context and are incorporated into the model through a single object-scene relevance measure that quantifies high-order object relations. This score is used to minimize semantic inconsistencies among objects in a dense graph representation of the scene category during object recognition. The context model is then incorporated into a Conditional Random Field (CRF) framework that combines contextual cues with appearance descriptors to improve object localization and class prediction accuracy. A novel context-based non-central hypergeometric unary potential is defined to maximize semantic coherence in the scene. Further refinement is performed using context-based pairwise and high-order potentials, with alpha-expansion and graph cuts used to find the optimal configuration. A comparison between the proposed approach and state-of-the-art algorithms shows the effectiveness of this approach in the annotation and interpretation of scenes.
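For orientation, a CRF that combines unary, pairwise, and high-order potentials is commonly written as an energy over the label assignment. The form below is a generic sketch, not the paper's exact formulation: the symbols (x for the labeling, V for the set of object or region nodes, E for the graph edges, C for the set of high-order cliques) are assumed notation, and the specific context-based potentials, including the non-central hypergeometric unary term, are not spelled out in this abstract.

```latex
% Generic CRF energy with unary, pairwise, and high-order terms.
% All symbols are illustrative assumptions, not taken from the paper.
\[
  E(\mathbf{x}) \;=\; \sum_{i \in \mathcal{V}} \psi_i(x_i)
  \;+\; \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(x_i, x_j)
  \;+\; \sum_{c \in \mathcal{C}} \psi_c(\mathbf{x}_c),
  \qquad
  \mathbf{x}^{*} \;=\; \arg\min_{\mathbf{x}} E(\mathbf{x}).
\]
```

Under this reading, alpha-expansion searches for a low-energy labeling by repeatedly solving binary subproblems with graph cuts, where each move lets any node either keep its current label or switch to a candidate label alpha.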
