Multiple instance learning with bag dissimilarities

Multiple instance learning (MIL) is concerned with learning from sets (bags) of objects (instances), where the individual instance labels are ambiguous. In this setting, supervised learning cannot be applied directly. Specialized MIL methods often learn by making additional assumptions about the relationship between bag labels and instance labels. Such assumptions may fit a particular dataset, but do not generalize to the whole range of MIL problems. Other MIL methods shift the focus of the assumptions from the labels to the overall (dis)similarity of bags, and therefore learn from bags directly. We propose to represent each bag by a vector of its dissimilarities to other bags in the training set, and to treat these dissimilarities as a feature representation. We show several alternative ways to define a dissimilarity between bags and discuss which definitions are more suitable for particular MIL problems. The experimental results show that the proposed approach is computationally inexpensive, yet very competitive with state-of-the-art algorithms on a wide range of MIL datasets.

Highlights:
- A general bag dissimilarities framework for multiple instance learning is explored.
- Point set distances and distribution distances are considered.
- Metric dissimilarities are not necessarily more informative.
- Results are competitive with, or outperform, state-of-the-art algorithms.
- Practical suggestions for end-users are provided.
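The bag representation described in the abstract can be illustrated with a minimal sketch. It assumes each bag is a NumPy array of shape (instances, features) and uses a "mean of minimum distances" point-set dissimilarity, which is only one possible choice and not necessarily the definition the authors recommend. The function names `meanmin_dissimilarity` and `dissimilarity_representation` and the toy data are hypothetical, introduced here for illustration.

```python
import numpy as np

def meanmin_dissimilarity(bag_a, bag_b):
    """Average, over instances in bag_a, of the distance to the nearest instance in bag_b.

    Assumption: one illustrative point-set dissimilarity. It is asymmetric,
    so it is not a metric.
    """
    # Pairwise Euclidean distances, shape (len(bag_a), len(bag_b)).
    diffs = bag_a[:, None, :] - bag_b[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return dists.min(axis=1).mean()

def dissimilarity_representation(bags, prototypes):
    """Represent each bag as a vector of its dissimilarities to the prototype bags."""
    return np.array([[meanmin_dissimilarity(b, p) for p in prototypes] for b in bags])

# Toy data (hypothetical): four bags of 2-D instances with varying sizes.
rng = np.random.default_rng(0)
train_bags = [rng.normal(loc=c, size=(n, 2))
              for c, n in [(0.0, 4), (0.0, 6), (3.0, 5), (3.0, 3)]]

# Each row is a fixed-length feature vector for one bag, so any standard
# supervised classifier could be trained on D_train with the bag labels.
D_train = dissimilarity_representation(train_bags, train_bags)
```

Because every bag is mapped to a vector whose length equals the number of prototype bags, bags with different numbers of instances become directly comparable. The asymmetry of this particular dissimilarity shows that a useful bag dissimilarity need not satisfy the metric axioms.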
