Stel component analysis: Modeling spatial correlations in image class structure

As a useful concept in the study of low-level image class structure, we introduce the notion of a structure element, or "stel." The notion is related to those of a pixel, superpixel, segment, or part, but instead of referring to an element or a region of a single image, a stel is a probabilistic element of an entire image class. Stels often define clear object or scene parts as a consequence of the modeling constraint that forces the regions belonging to a single stel to have a tight distribution over local measurements, such as color or texture. This self-similarity within a region of a single image is typical of most meaningful image parts, even when the corresponding parts in different images of similar objects do not have similar local measurements. The stel itself is expected to be consistent within a class, yet flexible, which we accomplish using a novel approach we call stel component analysis. Experimental results show how stel component analysis can assist in image/video segmentation and object recognition. In particular, it can be used as an alternative to, or in conjunction with, bag-of-features and related classifiers, with stel inference providing a meaningful spatial partition of features.
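As a rough illustration of how stel inference might supply a spatial partition for a bag-of-features classifier, the sketch below pools visual-word counts separately for each stel. It assumes that every local feature already comes with a quantized visual-word index and a posterior distribution over stels at its location. The function name `stel_histograms`, its arguments, and the toy data are hypothetical and are not taken from the paper.

```python
def stel_histograms(words, stel_posteriors, num_stels, vocab_size):
    """Return one normalized visual-word histogram per stel.

    words           -- visual-word index of each local feature (assumed given)
    stel_posteriors -- for each feature, a list of probabilities over stels
                       at the feature's location (assumed given)
    """
    hist = [[0.0] * vocab_size for _ in range(num_stels)]
    # Soft assignment: each feature adds its posterior mass to every stel.
    for word, posterior in zip(words, stel_posteriors):
        for s in range(num_stels):
            hist[s][word] += posterior[s]
    # Normalize each stel's histogram so that its bins sum to one.
    for s in range(num_stels):
        total = sum(hist[s])
        if total > 0:
            hist[s] = [v / total for v in hist[s]]
    return hist


# Toy example: five features, three visual words, two stels.
words = [0, 0, 1, 2, 2]
posteriors = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 1.0]]
h = stel_histograms(words, posteriors, num_stels=2, vocab_size=3)
```

Concatenating the per-stel histograms would give a descriptor that keeps track of which image part each feature fell in, in contrast to a single histogram pooled over the whole image.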
