New Graph Structured Sparsity Model for Multi-label Image Annotations

In multi-label image annotation, each image is associated with multiple categories, so the semantic terms (label classes) are not mutually exclusive. Previous research has shown that such label correlations can substantially improve annotation accuracy. However, existing methods apply the label correlation matrix directly to enhance label inference and assignment, without further learning the structural information among classes. In this paper, we model the label correlations with a relational graph and propose a novel graph structured sparse learning model that incorporates the topological constraints of the relational graph into multi-label classification. As a result, our method captures and exploits the hidden class structures in the relational graph to improve the annotation results. The proposed objective uses a large number of structured sparsity-inducing norms, which makes the optimization difficult. To solve this problem, we derive an efficient optimization algorithm with proven convergence. We perform extensive experiments on six multi-label image annotation benchmark data sets. In all empirical results, our method yields better annotation results than the state-of-the-art approaches.
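As a rough illustration of the idea, and not the paper's actual formulation, the sketch below builds a label relational graph from a binary label matrix and evaluates a group-wise sparsity penalty over graph neighbourhoods. Everything specific here is an assumption made for illustration: the cosine-similarity correlation, the `threshold` that defines graph edges, the choice of one group per label neighbourhood, and the function names `label_correlation`, `graph_groups`, and `graph_group_penalty`.

```python
import numpy as np

def label_correlation(Y):
    """Cosine similarity between label columns of Y (n_images x n_labels).

    Assumption: cosine similarity stands in for the label correlation measure.
    """
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for unused labels
    C = (Y.T @ Y) / np.outer(norms, norms)
    np.fill_diagonal(C, 0.0)         # no self-loops in the relational graph
    return C

def graph_groups(C, threshold=0.5):
    """One group per label: the label plus its neighbours in the thresholded graph.

    Assumption: an edge exists when correlation >= threshold (hypothetical rule).
    """
    k = C.shape[0]
    return [sorted({j} | set(np.flatnonzero(C[j] >= threshold).tolist()))
            for j in range(k)]

def graph_group_penalty(W, groups):
    """Sum of Frobenius norms of the column blocks W[:, g], one per group.

    W is a (n_features x n_labels) weight matrix; groups may overlap.
    """
    return sum(np.linalg.norm(W[:, g]) for g in groups)

# Toy data: 4 images, 3 labels; labels 0 and 1 co-occur often.
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [1, 0, 1]])
C = label_correlation(Y)
groups = graph_groups(C, threshold=0.5)
W = np.ones((2, 3))
penalty = graph_group_penalty(W, groups)
```

A penalty of this kind encourages the weight columns of strongly correlated labels to be selected or suppressed together, which is one way the topology of a label graph can enter a sparse learning objective.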
