Image Classification With Kernelized Spatial-Context

The goal of image classification is to assign each image in an unlabeled collection to one of a set of semantic classes. Many methods approach this goal by leveraging the visual appearance of local patches in images. However, the spatial context between these local patches also carries significant information that can improve classification accuracy. Traditional spatial contextual models, such as the two-dimensional hidden Markov model, attempt to construct one common model for each image category to depict the spatial structures of the images in that class. However, due to large intra-class variance within an image category, a single model has difficulty representing the varied spatial contexts of different images. In contrast, we propose to construct a prototype set of spatial contextual models, rather than only one model, by leveraging kernel methods. Such an algorithm combines the rich representational ability of spatial contextual models with the powerful classification ability of kernel methods. In particular, we propose a new distance measure between different spatial contextual models that integrates joint appearance-spatial image features. This distance measure can be computed efficiently in a recursive formulation that scales well with image size. Extensive experiments demonstrate that the proposed approach significantly outperforms state-of-the-art approaches.
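As an illustration of how a distance between per-image models can feed a kernel-based classifier, the following is a minimal sketch under stated assumptions, not the method of this paper. It assumes the pairwise distances are already available (the numbers below are made up), maps them to similarities with an exponential kernel exp(-gamma * d), and, for brevity, classifies by summed kernel similarity per class rather than with a support vector machine. The names `distance_to_kernel`, `classify`, and `gamma` are hypothetical. Note that exp(-gamma * d) is not guaranteed to be positive semi-definite for an arbitrary distance.

```python
import math

def distance_to_kernel(dist, gamma=1.0):
    """Map a matrix of pairwise distances to similarities exp(-gamma * d)."""
    return [[math.exp(-gamma * d) for d in row] for row in dist]

def classify(kernel_row, labels):
    """Return the label whose training models have the largest summed similarity.

    This is a simple stand-in for a kernel classifier such as an SVM.
    """
    scores = {}
    for similarity, label in zip(kernel_row, labels):
        scores[label] = scores.get(label, 0.0) + similarity
    return max(scores, key=scores.get)

# Hypothetical distances between the models of three training images.
train_dist = [
    [0.0, 0.4, 2.1],
    [0.4, 0.0, 1.9],
    [2.1, 1.9, 0.0],
]
labels = ["beach", "beach", "forest"]
K = distance_to_kernel(train_dist, gamma=1.0)

# Hypothetical distances from a query image's model to each training model.
query_dist = [1.8, 2.0, 0.3]
query_k = distance_to_kernel([query_dist], gamma=1.0)[0]
prediction = classify(query_k, labels)
print(prediction)
```

In practice, a kernel matrix such as `K` could instead be passed to an SVM implementation that accepts precomputed kernels.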
