Learning a distance metric from multi-instance multi-label data

Multi-instance multi-label learning (MIML) refers to the learning problems where each example is represented by a bag/collection of instances and is labeled by multiple labels. An example application of MIML is visual object recognition in which each image is represented by multiple key points (i.e., instances) and is assigned to multiple object categories. In this paper, we study the problem of learning a distance metric from multi-instance multi-label data. It is significantly more challenging than the conventional setup of distance metric learning because it is difficult to associate instances in a bag with its assigned class labels. We propose an iterative algorithm for MIML distance metric learning: it first estimates the association between instances in a bag and its assigned class labels, and learns a distance metric from the estimated association by a discriminative analysis; the learned metric will be used to update the association between instances and class labels, which is further used to improve the learning of distance metric. We evaluate the proposed algorithm by the task of automated image annotation, a well known MIML problem. Our empirical study shows an encouraging result when combining the proposed algorithm with citation-kNN, a state-of-the-art algorithm for multi-instance learning.

[1]  Thomas G. Dietterich,et al.  Solving the Multiple Instance Problem with Axis-Parallel Rectangles , 1997, Artif. Intell..

[2]  Tomás Lozano-Pérez,et al.  A Framework for Multiple-Instance Learning , 1997, NIPS.

[3]  Jun Wang,et al.  Solving the Multiple-Instance Problem: A Lazy Learning Approach , 2000, ICML.

[4]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[5]  Jason Weston,et al.  A kernel method for multi-labelled classification , 2001, NIPS.

[6]  Qi Zhang,et al.  EM-DD: An Improved Multiple-Instance Learning Technique , 2001, NIPS.

[7]  Thomas Hofmann,et al.  Support Vector Machines for Multiple-Instance Learning , 2002, NIPS.

[8]  Thomas Gärtner,et al.  Multi-Instance Kernels , 2002, ICML.

[9]  Misha Pavel,et al.  Adjustment Learning and Relevant Component Analysis , 2002, ECCV.

[10]  David A. Forsyth,et al.  Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.

[11]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[12]  Zhi-Hua Zhou,et al.  Ensembles of Multi-instance Learners , 2003, ECML.

[13]  Xin Xu,et al.  Logistic Regression and Boosting for Labeled Bags of Instances , 2004, PAKDD.

[14]  Paul A. Viola,et al.  Multiple Instance Boosting for Object Detection , 2005, NIPS.

[15]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[16]  Rong Jin,et al.  Distance Metric Learning: A Comprehensive Survey , 2006 .

[17]  James T. Kwok,et al.  A regularization framework for multiple-instance learning , 2006, ICML.

[18]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[19]  Thomas Hofmann,et al.  Multi-Instance Multi-Label Learning with Application to Scene Classification , 2007 .

[20]  Zhi-Hua Zhou,et al.  Multi-Label Learning by Instance Differentiation , 2007, AAAI.

[21]  Tao Mei,et al.  Joint multi-label multi-instance learning for image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Zhi-Hua Zhou,et al.  M3MIML: A Maximum Margin Method for Multi-instance Multi-label Learning , 2008, 2008 Eighth IEEE International Conference on Data Mining.