Witness identification in multiple instance learning using random subspaces

Multiple instance learning (MIL) is a form of weakly supervised learning in which instances are organized in bags. A label is provided for each bag, but not for individual instances. The MIL literature typically focuses on the classification of bags, treated either as a single object or as a combination of their instances. In both cases, performance is generally measured using labels assigned to entire bags. In this paper, the MIL problem is formulated as a knowledge discovery task in which algorithms seek to discover the witnesses (i.e., to identify the positive instances) using the weak supervision provided by bag labels. Some MIL methods are suitable for instance classification, but perform poorly in applications where the witness rate is low, or when the positive class distribution is multimodal. A new method that clusters data projected in random subspaces is proposed to perform witness identification in these adverse settings. The proposed method is assessed on MIL data sets from three application domains, and compared to seven reference MIL algorithms on the witness identification task. The proposed algorithm consistently ranks among the best methods in all experiments, while all other methods perform unevenly across data sets.
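The general idea of clustering instances in random subspaces and using bag labels to score them can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details the abstract does not give. It is a toy illustration under stated assumptions: a random subspace is taken to be a random subset of features, clustering is plain k-means, and each instance is scored by the average fraction of positive-bag instances in the clusters it falls into. The names `kmeans` and `witness_scores` and all parameter values are hypothetical.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centers = rng.sample(points, k)  # initialise centers on data points
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean).
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def witness_scores(bags, labels, n_subspaces=30, subspace_dim=2, k=3, seed=0):
    """Score each instance by how positive its clusters are (assumed rule).

    bags   : list of bags, each a list of feature vectors
    labels : one 0/1 label per bag (no instance labels are used)
    Returns one score per instance, in bag order; higher = more witness-like.
    """
    rng = random.Random(seed)
    instances = [x for bag in bags for x in bag]
    # Each instance inherits only the label of the bag it came from.
    from_positive = [lab for bag, lab in zip(bags, labels) for _ in bag]
    n_features = len(instances[0])
    scores = [0.0] * len(instances)
    for _ in range(n_subspaces):
        # Assumption: a random subspace is a random subset of features.
        dims = rng.sample(range(n_features), subspace_dim)
        projected = [[x[d] for d in dims] for x in instances]
        assign = kmeans(projected, k, rng)
        for c in range(k):
            members = [i for i, a in enumerate(assign) if a == c]
            if not members:
                continue
            # Fraction of the cluster that comes from positive bags.
            purity = sum(from_positive[i] for i in members) / len(members)
            for i in members:
                scores[i] += purity / n_subspaces
    return scores
```

In this sketch, a cluster made up almost entirely of instances from positive bags is treated as evidence that its members are witnesses. Negative instances from positive bags tend to share clusters with instances from negative bags, which lowers their score. Averaging over many subspaces is meant to reduce the dependence on any single projection.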
