Multi-label boosting for image annotation by structural grouping sparsity

We can obtain high-dimensional heterogeneous features from real-world images to describe various aspects of their visual characteristics, such as color, texture, and shape. Different kinds of heterogeneous features have different intrinsic discriminative power for image understanding. Selecting groups of discriminative features for particular semantics is hence crucial to making image understanding more interpretable. This paper formulates multi-label image annotation as a regression model with a regularized penalty. We call it Multi-label Boosting by the selection of heterogeneous features with structural Grouping Sparsity (MtBGS). MtBGS induces a (structural) sparse selection model to identify subgroups of homogeneous features for predicting a certain label. Moreover, the correlations among multiple tags are utilized in MtBGS to boost the performance of multi-label annotation. Extensive experiments on public image datasets show that the proposed approach achieves better multi-label image annotation performance and yields a quite interpretable model for image understanding.
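To make the idea of group-wise sparse feature selection concrete, the following is a minimal sketch of a generic group-lasso regression for a single label, solved by proximal gradient descent. It is not the MtBGS algorithm itself: the abstract does not give the exact penalty or solver, and the label-correlation step is not modeled here. The function names, the synthetic data, the group layout (three groups of four features standing in for, say, color, texture, and shape), and the values of `lam` and `n_iter` are all assumptions chosen for illustration.

```python
import numpy as np

def group_soft_threshold(w, groups, thresh):
    """Shrink each group's sub-vector toward zero by `thresh` in l2 norm.

    Groups whose norm is at most `thresh` are set exactly to zero,
    which is what removes whole feature groups from the model.
    """
    out = np.zeros_like(w)
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm > thresh:
            out[idx] = (1.0 - thresh / norm) * w[idx]
    return out

def fit_group_lasso(X, y, groups, lam=0.1, n_iter=500):
    """Minimize (1/2n)||y - Xw||^2 + lam * sum_g ||w_g||_2 (hypothetical helper)."""
    n, d = X.shape
    # Step size 1/L, where L is the Lipschitz constant of the smooth term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w

# Synthetic data: only the second feature group carries signal for this label.
rng = np.random.default_rng(0)
groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
X = rng.standard_normal((200, 12))
w_true = np.zeros(12)
w_true[4:8] = [1.5, -2.0, 1.0, 0.5]
y = X @ w_true + 0.01 * rng.standard_normal(200)

w_hat = fit_group_lasso(X, y, groups, lam=0.2)
# Indices of the feature groups that survive the penalty.
selected = [g for g, idx in enumerate(groups) if np.linalg.norm(w_hat[idx]) > 0]
print("selected groups:", selected)
```

In this sketch, the list of surviving groups is what makes the model interpretable: it names which kinds of features the label depends on. A multi-label version would fit one such weight vector per label, which is where a method like MtBGS could additionally exploit correlations among labels.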
