Random Field Model for Integration of Local Information and Global Information

This paper proposes a general framework that explicitly models both local and global information in a conditional random field. The method extracts global as well as local image features and uses them to predict the scene of the input image. From the predicted scene it generates scene-based top-down information, which represents the global spatial configuration of labels and the compatibility of categories across the image. Incorporating this global information helps resolve local ambiguities and yields recognition that is consistent both locally and globally. Despite the model's simplicity, the proposed method performs well on image labeling for two datasets.
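
The idea of letting a predicted scene resolve local ambiguities can be illustrated with a minimal sketch. It is not the paper's model: the function `label_sites`, the `weight` parameter, the two-site toy data, and the choice to average scene-conditioned log priors by the predicted scene probabilities are all assumptions made for illustration, and the pairwise terms of a full conditional random field are left out.

```python
import math

def label_sites(local_scores, scene_probs, scene_prior, weight=1.0):
    """Pick a label per site from local scores plus a scene-based prior.

    local_scores[i][c]   : local (bottom-up) log-score of label c at site i
    scene_probs[s]       : predicted probability of scene s (from global features)
    scene_prior[s][i][c] : log prior of label c at site i under scene s
    weight               : hypothetical trade-off between local and global terms
    """
    labels = []
    for i, site_scores in enumerate(local_scores):
        combined = []
        for c, local in enumerate(site_scores):
            # Expected top-down log prior under the predicted scene distribution.
            top_down = sum(p * scene_prior[s][i][c]
                           for s, p in enumerate(scene_probs))
            combined.append(local + weight * top_down)
        # Independent per-site argmax; a real CRF would also use pairwise terms.
        labels.append(max(range(len(combined)), key=combined.__getitem__))
    return labels

# Toy data (assumed): labels 0 = "sky", 1 = "water"; site 0 is the top of the
# image, site 1 the bottom. Local evidence is weak and slightly misleading.
local_scores = [[0.0, 0.1], [0.1, 0.0]]
scene_probs = [0.9, 0.1]  # scene 0 = "coast", scene 1 = uninformative
scene_prior = [
    [[math.log(0.8), math.log(0.2)], [math.log(0.2), math.log(0.8)]],
    [[math.log(0.5), math.log(0.5)], [math.log(0.5), math.log(0.5)]],
]

local_only = label_sites(local_scores, scene_probs, scene_prior, weight=0.0)
with_scene = label_sites(local_scores, scene_probs, scene_prior)
```

In this toy setup the local scores alone put "water" above "sky", while adding the scene-based term flips both sites to the spatially plausible arrangement.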
