Multi-Instance Multi-Label Learning Combining Hierarchical Context and its Application to Image Annotation

In image annotation, an image is often modeled as a bag of regions (“instances”) associated with multiple labels, which makes it a typical application of multi-instance multi-label learning (MIML). Although much research has shown that the interplay among instances and labels can substantially improve image annotation accuracy, most existing MIML methods use no context cues or only some of them. In this paper, we propose a novel context-aware MIML model that integrates the instance context and the label context into a general framework. Specifically, the instance context is constructed with multiple graphs, while the label context is built through a linear combination of several common latent concepts that link low-level features and high-level semantic labels. Comparisons with other leading methods on several benchmark datasets for image annotation show that the proposed method outperforms state-of-the-art approaches.
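The two kinds of context named in the abstract can be illustrated with a minimal sketch. It is not the model proposed in the paper: the function names, the k-nearest-neighbour graph, the max-pooling of instance scores, and all array shapes are assumptions chosen only to make the ideas concrete. One graph over the regions of a bag stands in for the instance context, and label weight vectors formed as linear combinations of a few shared latent concepts stand in for the label context.

```python
import numpy as np


def knn_affinity(instances, k=2):
    """Symmetric k-nearest-neighbour graph over the instances of one bag.

    Assumption: a plain Euclidean kNN graph; the paper's graphs may differ.
    """
    n = instances.shape[0]
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances between region features.
    diff = instances[:, None, :] - instances[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    # Exclude self-matches so an instance is never its own neighbour.
    np.fill_diagonal(dist, np.inf)
    affinity = np.zeros((n, n))
    for i in range(n):
        neighbours = np.argsort(dist[i])[:k]
        affinity[i, neighbours] = 1.0
    # Keep an edge if either endpoint selected the other.
    return np.maximum(affinity, affinity.T)


def label_weights(concepts, coefficients):
    """Label weight vectors as linear combinations of shared latent concepts.

    concepts: (d, K) basis; coefficients: (K, T); result: (d, T).
    """
    return concepts @ coefficients


def bag_scores(instances, weights):
    """Score every label for a bag by max-pooling instance-level scores.

    Assumption: max-pooling is a common multi-instance choice, used here
    only for illustration.
    """
    return (instances @ weights).max(axis=0)


rng = np.random.default_rng(0)
bag = rng.normal(size=(5, 4))            # 5 regions, 4-dimensional features
graph = knn_affinity(bag, k=2)           # instance context for this bag
weights = label_weights(rng.normal(size=(4, 3)),   # 3 latent concepts
                        rng.normal(size=(3, 6)))   # 6 labels
scores = bag_scores(bag, weights)        # one score per label
```

Because every label's weight vector lies in the span of the same few concepts, the weight matrix has rank at most the number of concepts, which is one simple way for labels to share information.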
