A k-Nearest Neighbor Based Algorithm for Multi-Instance Multi-Label Active Learning

Multi-instance multi-label learning (MIML) is a machine learning framework in which each object is represented by multiple instances and associated with multiple labels. This relatively new approach has been applied successfully in various domains, particularly those that involve learning from complex objects. Because MIML data are complex, labeling costs rise sharply as more labeled data are collected to improve model performance. In this paper, we introduce a MIML active learning approach that reduces the labeling cost of MIML data without compromising model performance. In each iteration, a query strategy selects the most informative object, and its label set is requested from the oracle. Our approach is formulated in a pool-based scenario and uses MIML-\(k\)NN as the base classifier. This MIML classifier is based on the \(k\)-Nearest Neighbor algorithm and has achieved superior performance in different data domains. We propose novel query strategies and also implement query strategies previously used for MIML learning. Finally, we conduct an experimental evaluation on various benchmark datasets. We demonstrate that, on all datasets and across various evaluation criteria, these approaches achieve significantly better results than learning without active selection.
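The pool-based loop described above can be illustrated with a minimal sketch. It is not the MIML-\(k\)NN classifier or any of the query strategies evaluated in the paper: the base learner here is a simplified bag-level \(k\)NN over the Hausdorff distance, the uncertainty criterion is an illustrative stand-in, and all function names (`hausdorff`, `label_scores`, `uncertainty`, `active_learning`) are hypothetical.

```python
import math


def hausdorff(bag_a, bag_b):
    """Maximal Hausdorff distance between two bags of instance vectors."""
    def directed(src, dst):
        return max(min(math.dist(x, y) for y in dst) for x in src)
    return max(directed(bag_a, bag_b), directed(bag_b, bag_a))


def label_scores(bag, labeled, n_labels, k=3):
    """Fraction of the k nearest labeled bags that carry each label.

    Assumption: a plain bag-level kNN vote, used only as a stand-in
    for the base classifier.
    """
    nearest = sorted(labeled, key=lambda item: hausdorff(bag, item[0]))[:k]
    return [sum(lbl in labels for _, labels in nearest) / len(nearest)
            for lbl in range(n_labels)]


def uncertainty(scores):
    """Mean closeness of the label votes to 0.5 (illustrative criterion)."""
    return sum(1.0 - abs(2.0 * s - 1.0) for s in scores) / len(scores)


def active_learning(labeled, pool, oracle, n_labels, budget, k=3):
    """Pool-based loop: query the most uncertain bag, then add it to the labeled set."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(min(budget, len(pool))):
        query = max(
            pool,
            key=lambda bag: uncertainty(label_scores(bag, labeled, n_labels, k)),
        )
        pool.remove(query)
        # The oracle returns the full label set of the queried object.
        labeled.append((query, oracle(query)))
    return labeled, pool
```

Each bag is a list of instance tuples and each label set is a Python `set` of integer label indices. Scores are recomputed in every iteration, which is affordable only for small pools.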
