An Efficient Stacking Model of Multi-Label Classification Based on Pareto Optimum

Nowadays, multi-label data are ubiquitous in real-world applications, in which each instance is associated with a set of labels. Multi-label learning has attracted significant attention from researchers, and many algorithms have been proposed. Among them, binary relevance (BR) is a widely used framework for multi-label classification: it constructs one binary classifier per label in a one-vs-rest manner. The BR approach is a simple and straightforward problem-transformation method for multi-label learning, but it completely ignores label correlations. Stacking-based BR is a feasible way to tackle this problem. The key issue in stacking-based BR is how to select a label subset to extend the original features for each label. Existing stacking-based BR methods usually select an identical label subset for all labels, which may be suboptimal because each label has its own most related label subset. In this paper, a novel stacking-based method grounded in Pareto optimality is introduced to exploit label correlations and improve the performance of BR. Our method builds a stack of two layers of BR classifiers. At the first layer, a group of binary classifiers is constructed, one per label. At the second layer, for each label we employ Pareto optimality to select its most related label subset and then augment the original features with the selected labels. The final binary classifier for each label is trained on the corresponding augmented feature space. Experimental results on several multi-label benchmark datasets, evaluated with different multi-label classification criteria, demonstrate the superiority of the proposed method over other well-established stacking-based multi-label learning algorithms.
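To make the two-layer procedure concrete, the following is a minimal sketch in Python with scikit-learn, not the authors' implementation. The class name ParetoStackedBR, the choice of logistic regression as the base classifier, and the two per-label criteria used to build the Pareto front (correlation with the target label and negated average redundancy with the other candidate labels) are all assumptions made for illustration; the paper defines its own selection criteria.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict


def pareto_front(scores):
    """Return indices of rows not dominated by any other row (maximisation)."""
    n = scores.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(scores[j] >= scores[i]) and np.any(scores[j] > scores[i]):
                keep[i] = False
                break
    return np.flatnonzero(keep)


class ParetoStackedBR:
    """Two-layer stacked binary relevance with per-label Pareto label-subset selection (illustrative sketch)."""

    def __init__(self, base=LogisticRegression(max_iter=1000)):
        self.base = base

    def fit(self, X, Y):
        n_labels = Y.shape[1]
        # Layer 1: one binary classifier per label; out-of-fold predictions serve as meta-features.
        self.layer1_ = [clone(self.base).fit(X, Y[:, j]) for j in range(n_labels)]
        meta = np.column_stack([
            cross_val_predict(clone(self.base), X, Y[:, j], cv=5)
            for j in range(n_labels)
        ])
        # Layer 2: for each label, keep only a Pareto-optimal subset of the other labels as extra features.
        self.layer2_, self.subsets_ = [], []
        for j in range(n_labels):
            cand = [k for k in range(n_labels) if k != j]
            # Hypothetical criteria: relevance to label j, and (negated) mean redundancy with other candidates.
            rel = np.array([abs(np.corrcoef(Y[:, k], Y[:, j])[0, 1]) for k in cand])
            red = np.array([np.mean([abs(np.corrcoef(Y[:, k], Y[:, m])[0, 1])
                                     for m in cand if m != k]) for k in cand])
            subset = [cand[i] for i in pareto_front(np.column_stack([rel, -red]))]
            self.subsets_.append(subset)
            # Augment the original features with the first-layer outputs of the selected labels.
            X_aug = np.hstack([X, meta[:, subset]])
            self.layer2_.append(clone(self.base).fit(X_aug, Y[:, j]))
        return self

    def predict(self, X):
        meta = np.column_stack([clf.predict(X) for clf in self.layer1_])
        return np.column_stack([
            clf.predict(np.hstack([X, meta[:, subset]]))
            for clf, subset in zip(self.layer2_, self.subsets_)
        ])
```

The point of the sketch is the per-label selection step: each label obtains its own Pareto front over the candidate labels, so different labels can be augmented with different label subsets, in contrast to stacking variants that use one fixed subset for all labels.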
