Histogram Contextualization

Histograms are widely used for feature representation in image and video content analysis. However, because the summarization process is orderless, histograms generally lack spatial information, which may degrade their discriminative power in visual classification tasks. Although several research attempts have been made to encode spatial context into histograms, how to extend these encodings to higher-order spatial context remains an open problem. In this paper, we propose a general histogram contextualization method that efficiently encodes higher-order spatial context. The method is based on the co-occurrence of local visual homogeneity patterns and is therefore able to generate more discriminative histogram representations while remaining compact and robust. We also investigate how to extend histogram contextualization to multiple modalities of context. We show that the proposed method extends naturally to combine temporal and spatial context, which facilitates video content analysis. In addition, we introduce a method that combines cross-feature context with spatial context using random forests. Comprehensive experiments on face image classification and human activity recognition tasks demonstrate that the proposed histogram contextualization method outperforms existing encoding methods.
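The claim that an orderless histogram discards spatial information can be illustrated with a toy example. The sketch below is not the contextualization method proposed in the paper; it only assumes two hypothetical 4x4 label maps and a simple first-order co-occurrence count over adjacent pixels (the helper names `histogram` and `cooccurrence` are illustrative), to show that two layouts with identical histograms can still be separated once neighbouring labels are counted together.

```python
from collections import Counter


def histogram(img):
    """Orderless histogram: count of each label, ignoring position."""
    return Counter(v for row in img for v in row)


def cooccurrence(img):
    """Count unordered label pairs over horizontally and vertically adjacent pixels."""
    pairs = Counter()
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # right neighbour
                pairs[tuple(sorted((img[y][x], img[y][x + 1])))] += 1
            if y + 1 < h:  # bottom neighbour
                pairs[tuple(sorted((img[y][x], img[y + 1][x])))] += 1
    return pairs


# Two toy 4x4 label maps (hypothetical data) with the same label counts
# but different spatial layouts.
blocks = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 1]]
checker = [[0, 1, 0, 1],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [1, 0, 1, 0]]

same_hist = histogram(blocks) == histogram(checker)        # histograms agree
same_cooc = cooccurrence(blocks) == cooccurrence(checker)  # co-occurrences differ
```

Here the plain histograms are identical, while the pairwise co-occurrence counts differ. This covers only first-order (pairwise) context; the higher-order context the abstract refers to goes beyond what this sketch shows.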
