Contextual decomposition of multi-label images

Most research on image decomposition, e.g., image segmentation and image parsing, has focused on low-level visual cues within a single image and neglected the contextual information shared across images. In this paper, we present a new perspective on image decomposition, guided by the multiple labels associated with each image. Observing that contextual information exists across images (i.e., local representations of the same label are similar, while those of different labels are dissimilar), we propose to decompose images collectively. The image decomposition problem is formulated as an optimization that maximizes the inter-label difference and simultaneously minimizes the intra-label difference of the target label representations. Such contextual image decomposition has a wide variety of applications, two exemplary ones being: 1) multi-label image annotation, in which the sparse coding of a query image over the bases consisting of all learned label representations naturally produces the multi-label annotation, and 2) label ranking, in which the annotated labels are re-ordered according to their sparse coding coefficients on the learned label representations. Notably, these two applications can be performed simultaneously via the label propagation process in sparse coding.
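
To make the annotation and ranking step concrete, the following is a minimal sketch and not the paper's actual solver. It assumes one already-learned basis vector per label (a simplification), a non-negative L1-regularized least-squares objective solved with plain ISTA, and hypothetical names (`sparse_code`, `annotate_and_rank`, `lam`, `thresh`) and toy data chosen purely for illustration.

```python
import numpy as np

def sparse_code(x, B, lam=0.1, n_iter=500):
    """Non-negative ISTA for: min_a 0.5 * ||x - B a||^2 + lam * ||a||_1, a >= 0.

    ISTA is an assumed, generic solver; the source does not specify one.
    """
    # Lipschitz constant of the gradient: squared largest singular value of B.
    lipschitz = np.linalg.norm(B, 2) ** 2
    a = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ a - x)
        # Gradient step followed by a non-negative soft threshold.
        a = np.maximum(a - grad / lipschitz - lam / lipschitz, 0.0)
    return a

def annotate_and_rank(x, B, labels, lam=0.1, thresh=1e-3):
    """Return (label, coefficient) pairs sorted by decreasing coefficient.

    Labels whose coefficient exceeds `thresh` form the annotation;
    their order gives the ranking, so both come from one coding pass.
    """
    a = sparse_code(x, B, lam)
    order = np.argsort(-a)
    return [(labels[i], float(a[i])) for i in order if a[i] > thresh]

# Toy data (hypothetical): four orthonormal label bases, one column per label.
labels = ["sky", "water", "grass", "car"]
B = np.eye(4)
# A query image feature dominated by "sky" with some "grass".
x = 0.7 * B[:, 0] + 0.3 * B[:, 2]

result = annotate_and_rank(x, B, labels)
print(result)
```

In this toy setting the query is annotated with "sky" and "grass", in that order, because the coefficient on the "sky" basis is larger. Real label representations would not be orthonormal, and each label would typically span several basis vectors whose coefficients are aggregated per label.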
