An efficient multi-label learning method with label projection

Abstract Multi-label classification (MLC) is the problem in which each sample may be associated with more than one label simultaneously. It has many everyday applications, such as text categorization and image annotation, and many methods have been proposed for it. We put forward an MLC method called TPMLC (an MLC method with Two Parts) and propose a unified loss function derived from variational inference under Bayesian and Gaussian distribution assumptions. This loss function consists of two parts. The first part models the relationship between a sample and its labels, for which we adopt a set of support vector machines (SVMs). The second part models the relationships among the labels, for which we construct a projection matrix model. The two parts are optimized simultaneously, so that both kinds of relationship are learned at the same time. We also present a convergence analysis and a computational complexity analysis of the method. In the experiments, a comparison of TPMLC with state-of-the-art approaches demonstrates its feasibility and competitive classification performance, and the statistical results show that the proposed method performs better than the compared state-of-the-art methods.
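
The two-part idea described in the abstract can be illustrated with a minimal sketch. This is not the TPMLC objective itself, which the abstract does not spell out: the specific loss form (a per-label hinge loss plus a squared reconstruction term through a label projection matrix), the names `make_toy_data`, `joint_loss` and `fit`, and the hyperparameters `lam`, `mu`, `lr` and `steps` are all assumptions chosen for illustration. The sketch only shows how a sample-to-label part and a label-to-label part might share one objective and be updated in the same loop.

```python
import numpy as np


def make_toy_data(n=200, d=5, q=3, seed=0):
    """Synthetic multi-label data; labels are in {-1, +1}, one column per label."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    Y = np.sign(X @ rng.normal(size=(d, q)))
    Y[Y == 0] = 1.0
    return X, Y


def joint_loss(W, P, X, Y, lam=0.5, mu=0.01):
    """Assumed two-part objective (illustrative, not the paper's formula).

    Part 1: hinge loss of one linear SVM per label (columns of W).
    Part 2: squared error after projecting the scores through P,
            which lets each label's prediction borrow from the others.
    """
    n = X.shape[0]
    S = X @ W                                        # per-label scores, shape (n, q)
    hinge = np.maximum(0.0, 1.0 - Y * S).sum() / n   # sample-to-label part
    proj = lam * ((S @ P - Y) ** 2).sum() / n        # label-to-label part
    reg = 0.5 * mu * (W ** 2).sum()                  # L2 penalty on the SVM weights
    return hinge + proj + reg


def fit(X, Y, lam=0.5, mu=0.01, lr=0.05, steps=200):
    """Full-batch subgradient descent updating W and P in the same step."""
    n, d = X.shape
    q = Y.shape[1]
    W = np.zeros((d, q))
    P = np.eye(q)                                    # start from "no label interaction"
    history = [joint_loss(W, P, X, Y, lam, mu)]
    for _ in range(steps):
        S = X @ W
        R = S @ P - Y                                # residual of the projection part
        active = Y * S < 1.0                         # margin violations for the hinge part
        grad_S = -(Y * active) / n + (2.0 * lam / n) * (R @ P.T)
        grad_W = X.T @ grad_S + mu * W
        grad_P = (2.0 * lam / n) * (S.T @ R)
        W -= lr * grad_W                             # both parameter sets move together
        P -= lr * grad_P
        history.append(joint_loss(W, P, X, Y, lam, mu))
    return W, P, history


X, Y = make_toy_data()
W, P, history = fit(X, Y)
print("initial loss:", history[0], "final loss:", history[-1])
```

In this sketch the off-diagonal entries of `P` are where label interactions would show up; whether the paper's projection matrix plays exactly this role is an assumption.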
