Multi-label visual classification with label exclusive context

We introduce a novel approach to multi-label image classification that incorporates a new type of context, label exclusive context, into linear representation and classification. Given a set of exclusive label groups that describe the negative relationships among class labels, our method, LELR (Label Exclusive Linear Representation), enforces repulsive assignment of the labels from each group to a query image. The problem can be formulated as an exclusive Lasso (eLasso) model with group overlaps and an affine transformation. Since existing eLasso solvers are not directly applicable to this variant of eLasso, we propose an optimization algorithm based on Nesterov's smoothing approximation for efficient optimization. Extensive comparative experiments on challenging real-world visual classification benchmarks demonstrate the effectiveness of incorporating label exclusive context into visual classification.
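To make the exclusive penalty and its smoothing concrete, here is a minimal Python sketch. It assumes the standard exclusive Lasso penalty (the sum over groups of the squared l1 norm of each group's coefficients) and replaces each absolute value with its Nesterov-smoothed (Huber) surrogate. The function names, the toy vector `w`, the overlapping `groups`, and the smoothing parameter `mu` are illustrative assumptions, not the paper's exact objective, which also involves an affine transformation that is not reproduced here.

```python
import numpy as np

def exclusive_penalty(w, groups):
    """Standard exclusive Lasso penalty: sum over groups of (l1 norm of group)^2.

    Groups may overlap; an index shared by two groups contributes to both.
    """
    return sum(np.abs(w[list(g)]).sum() ** 2 for g in groups)

def smoothed_abs(x, mu):
    """Nesterov smoothing of |x|: max over |u| <= 1 of (u*x - mu*u^2/2).

    The closed form is the Huber function: quadratic near zero, linear elsewhere.
    """
    ax = np.abs(x)
    return np.where(ax <= mu, ax ** 2 / (2.0 * mu), ax - mu / 2.0)

def smoothed_exclusive_penalty(w, groups, mu):
    """Differentiable surrogate: each |w_i| is replaced by its smoothed version."""
    return sum(smoothed_abs(w[list(g)], mu).sum() ** 2 for g in groups)

# Toy example (hypothetical values): two groups overlapping at index 1.
w = np.array([0.5, 0.0, -2.0, 1.0])
groups = [(0, 1), (1, 2, 3)]

exact = exclusive_penalty(w, groups)
smooth = smoothed_exclusive_penalty(w, groups, mu=0.1)
print(exact, smooth)
```

Because the squared l1 norm grows fastest when several coefficients in the same group are nonzero, the penalty discourages labels in one exclusive group from being active together. The smoothed version never exceeds the exact penalty and approaches it as `mu` shrinks, which is what makes gradient-based methods applicable.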
