Bag Dissimilarities for Multiple Instance Learning

When objects cannot be represented well by a single feature vector, a collection of feature vectors can be used instead. This is the approach of Multiple Instance Learning (MIL), where such a collection is called a bag of instances. A bag gives an object more internal structure than a single feature vector, which improves the expressiveness of the representation but also adds complexity to the classification of the object. This paper shows that, in situations where no single instance determines the class label of a bag, simple bag dissimilarity measures can significantly outperform standard multiple instance classifiers. In particular, a measure that computes just the average minimum distance between instances, and a measure that uses the Earth Mover's distance, perform very well.
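As an illustration of the first measure, here is a minimal sketch of an average-minimum-distance bag dissimilarity. It assumes bags are NumPy arrays of shape (n_instances, n_features) and uses Euclidean instance distances; the function name and the Euclidean choice are our assumptions for the example, not details fixed by the abstract.

```python
import numpy as np

def mean_min_dist(bag_a, bag_b):
    """Average, over instances in bag_a, of the distance to the
    closest instance in bag_b (note: asymmetric in its arguments)."""
    # Pairwise Euclidean distances between all instances of the two bags,
    # shape (len(bag_a), len(bag_b)).
    d = np.linalg.norm(bag_a[:, None, :] - bag_b[None, :, :], axis=-1)
    # For each instance in bag_a, take its nearest neighbour in bag_b,
    # then average those minimum distances.
    return d.min(axis=1).mean()
```

A symmetric dissimilarity, if needed, can be obtained by averaging `mean_min_dist(a, b)` and `mean_min_dist(b, a)`.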
