Multi-label classification with weighted classifier selection and stacked ensemble

Abstract: Multi-label classification has attracted increasing attention in applications such as medical diagnosis and semantic annotation. With this trend, a large number of ensemble approaches have been proposed for multi-label classification tasks. Most of these approaches construct their ensemble members with bagging schemes, and few stacked ensemble approaches have been developed. Research on stacked ensembles remains active, but several issues are unresolved: (1) little has been done to learn classifier weights for classifier selection; (2) the relationship between pairwise label correlations and multi-label classification performance has not been sufficiently investigated. To address these issues, we propose a novel stacked ensemble approach that simultaneously exploits label correlations and learns the weights of ensemble members. First, a weighted stacked ensemble with sparsity regularization is developed to facilitate classifier selection and ensemble member construction for multi-label classification. Second, to improve classification performance, pairwise label correlations are further considered when determining the weights of these ensemble members. Finally, we develop an optimization algorithm based on both the accelerated proximal gradient and block coordinate descent techniques to obtain the optimal ensemble solution efficiently. Extensive experiments on publicly available datasets and real Cardiovascular and Cerebrovascular Disease datasets demonstrate that the proposed algorithm outperforms related state-of-the-art methods in both benchmark and real-world settings.
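The idea of selecting classifiers through sparsity-regularized stacking weights can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes a single label, a plain L1 penalty, a squared loss, and an accelerated proximal gradient (FISTA-style) loop only, with no label-correlation term and no block coordinate descent. The function names, the synthetic data, and the values of `lam` and `n_iter` are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_sparse_stacking_weights(P, y, lam=0.1, n_iter=500):
    """Minimise (1/2n) * ||P w - y||^2 + lam * ||w||_1 by accelerated
    proximal gradient. P holds one column of predictions per base
    classifier; w holds one stacking weight per base classifier."""
    n, k = P.shape
    # Lipschitz constant of the gradient of the smooth (squared-loss) term.
    lipschitz = np.linalg.norm(P, 2) ** 2 / n
    w = np.zeros(k)
    z = w.copy()  # extrapolated point used for the gradient step
    t = 1.0
    for _ in range(n_iter):
        grad = P.T @ (P @ z - y) / n
        w_next = soft_threshold(z - grad / lipschitz, lam / lipschitz)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)  # momentum step
        w, t = w_next, t_next
    return w


# Synthetic example (assumed data): two informative base classifiers
# and two that output pure noise, for one label coded as -1 / +1.
rng = np.random.default_rng(0)
n = 200
y = rng.choice([-1.0, 1.0], size=n)
P = np.column_stack([
    y + 0.3 * rng.standard_normal(n),  # informative classifier
    y + 0.3 * rng.standard_normal(n),  # informative classifier
    rng.standard_normal(n),            # uninformative classifier
    rng.standard_normal(n),            # uninformative classifier
])
w = fit_sparse_stacking_weights(P, y, lam=0.1)
print("stacking weights:", np.round(w, 3))
```

In this toy setting the L1 penalty is expected to drive the weights of the uninformative classifiers to zero, which is the sense in which sparse weights perform classifier selection. Extending the sketch to several labels with a correlation-aware penalty would require the structure described in the paper itself.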
