Fragmentary Multi-Instance Classification

Multi-instance learning (MIL) has been applied extensively to real-world tasks in which each object is represented as a bag of instances, such as drug molecules and images. Previous studies on MIL assume that the data are complete. However, in many real tasks, the instances are fragmentary, that is, some of their entries are missing. In this article, we present what is probably the first study on multi-instance classification with fragmentary data. In our proposed framework, called fragmentary multi-instance classification (FIC), the fragmentary data are completed and the multi-instance classifier is learned jointly. To facilitate the integration of completion and classifier learning, FIC establishes a weighting mechanism that measures the importance of different instances. To validate the compatibility of our framework, four typical MIL methods, namely multi-instance support vector machine (MI-SVM), expectation-maximization diverse density (EM-DD), citation $K$-nearest neighbors (Citation-KNN), and MIL with discriminative bag mapping (MILDM), are embedded into the framework to obtain the corresponding FIC versions. As an illustration, an efficient solving algorithm is developed for the MI-SVM version, together with a proof of its convergence behavior. Experimental results on various types of real-world datasets demonstrate the effectiveness of FIC.
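To make the setting concrete, the following minimal sketch shows one way fragmentary bags could be represented and scored. It is an illustration under stated assumptions, not the FIC algorithm: missing entries are encoded as NaN, completion is replaced by simple column-mean imputation (a stand-in for the joint completion described above), and a bag is scored by its highest-scoring instance, as in the usual MI-SVM formulation. The function names `impute_column_means` and `bag_scores`, the toy data, and the weight vector are all hypothetical.

```python
import numpy as np

def impute_column_means(bags):
    """Fill NaN entries with per-feature means of the observed values.

    Assumption: this naive imputation only stands in for a real
    completion step; it is not the joint completion used by FIC.
    """
    stacked = np.vstack(bags)                 # all instances, one per row
    col_means = np.nanmean(stacked, axis=0)   # means over observed entries only
    return [np.where(np.isnan(bag), col_means, bag) for bag in bags]

def bag_scores(bags, w, b):
    """Score each bag by the maximum linear instance score (MI-SVM style)."""
    return np.array([np.max(bag @ w + b) for bag in bags])

# Two toy bags, each with two 2-dimensional instances; NaN marks a missing entry.
bags = [
    np.array([[1.0, np.nan], [0.0, 2.0]]),
    np.array([[np.nan, 0.0], [3.0, 4.0]]),
]

completed = impute_column_means(bags)

# Hypothetical linear classifier parameters, chosen only for illustration.
w = np.array([1.0, -1.0])
bias = 0.0
scores = bag_scores(completed, w, bias)
print(scores)
```

In this sketch completion and classification are two separate passes; the framework described above instead couples them, so that the classifier and the instance weights can influence how the missing entries are filled in.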
