Multi-label Learning with Highly Incomplete Data via Collaborative Embedding

Considerable effort has been devoted to improving multi-label learning with incomplete label assignments. Most current techniques assume that the input features of data instances are complete. In real-world multi-label learning applications, however, highly incomplete features and weak label assignments often occur together, for practical reasons such as incomplete data collection and limited labelling by annotators. Existing multi-label learning algorithms are not directly applicable when the observed features are highly incomplete. In this work, we address this problem by proposing a weakly supervised multi-label learning approach based on the idea of collaborative embedding. The approach provides a flexible framework for efficient multi-label classification in both transductive and inductive modes by coupling the reconstruction of missing features and of weak label assignments in a joint optimisation framework. It is designed to recover feature and label information collaboratively and to extract the predictive association between the feature profile and the multi-label tag of the same data instance. Extensive experiments on public benchmark datasets and real security event data show that the proposed method provides distinctly more accurate transductive and inductive classification than other state-of-the-art algorithms.
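To make the idea of coupling feature and label reconstruction concrete, the following is a minimal sketch in Python with NumPy. It is not the method proposed in the paper: it assumes a generic masked joint matrix factorisation in which a shared instance embedding `U` reconstructs the observed feature entries through `V` and the observed label entries through `W`, trained by plain gradient descent. The names `joint_factorize` and `masked_loss`, the squared loss, and all hyperparameters are illustrative assumptions. It covers only the transductive setting.

```python
import numpy as np


def masked_loss(X, mask_x, Y, mask_y, U, V, W, lam=1.0, reg=0.1):
    """Squared reconstruction error on observed entries plus L2 penalty."""
    rx = mask_x * (U @ V.T - X)  # feature residual, zero where unobserved
    ry = mask_y * (U @ W.T - Y)  # label residual, zero where unobserved
    penalty = (U ** 2).sum() + (V ** 2).sum() + (W ** 2).sum()
    return 0.5 * (rx ** 2).sum() + 0.5 * lam * (ry ** 2).sum() + 0.5 * reg * penalty


def joint_factorize(X, mask_x, Y, mask_y, rank=5, lam=1.0, reg=0.1,
                    lr=0.005, n_iters=2000, seed=0):
    """Hypothetical joint factorisation with a shared instance embedding.

    X: (n, d) features, mask_x: (n, d) with 1 where a feature is observed.
    Y: (n, q) labels,   mask_y: (n, q) with 1 where a label is observed.
    Returns U (n, rank), V (d, rank), W (q, rank).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    q = Y.shape[1]
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((d, rank))
    W = 0.1 * rng.standard_normal((q, rank))
    for _ in range(n_iters):
        rx = mask_x * (U @ V.T - X)
        ry = mask_y * (U @ W.T - Y)
        # Gradients of masked_loss with respect to each factor.
        grad_u = rx @ V + lam * (ry @ W) + reg * U
        grad_v = rx.T @ U + reg * V
        grad_w = lam * (ry.T @ U) + reg * W
        U -= lr * grad_u
        V -= lr * grad_v
        W -= lr * grad_w
    return U, V, W


if __name__ == "__main__":
    # Synthetic low-rank data, purely for illustration.
    rng = np.random.default_rng(42)
    latent = rng.standard_normal((40, 3))
    X = latent @ rng.standard_normal((3, 10)) / np.sqrt(3)
    Y = (latent @ rng.standard_normal((3, 4)) > 0.5).astype(float)
    mask_x = (rng.random(X.shape) < 0.5).astype(float)  # half the features missing
    mask_y = (rng.random(Y.shape) < 0.7).astype(float)  # some labels unobserved
    U, V, W = joint_factorize(X, mask_x, Y, mask_y, rank=3)
    label_scores = U @ W.T  # transductive scores for every (instance, label) pair
    print(label_scores.shape)
```

Because `U` appears in both reconstruction terms, the observed labels influence how missing features are filled in, and the observed features influence the label scores. This is the coupling the abstract refers to, in its simplest form. An inductive variant would additionally need a way to map the features of an unseen instance to an embedding, which this sketch does not attempt.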
