Ensemble selection with joint spectral clustering and structural sparsity

Abstract Ensemble selection techniques generally fall into two categories: static and dynamic. Static ensemble selection chooses a fixed subset of the original ensemble, which reduces space complexity but cannot adapt to individual test instances. Dynamic ensemble selection chooses base learners on the fly for each test instance, but because the full ensemble must be retained it yields little improvement in space complexity. To date, no ensemble selection technique is both robust to test instances and economical in space. To narrow this gap, we propose a novel static ensemble selection method, called Ensemble Selection with Joint Spectral Clustering and Structural Sparsity. This method integrates spectral clustering and structural sparsity into a joint framework whose selected sub-ensemble is robust to test instances and consumes less space. Experiments on 25 datasets from KEEL and UCI demonstrate the effectiveness of the proposed algorithm and its promising performance compared with other state-of-the-art algorithms.
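
To make the general idea concrete, below is a minimal sketch of the cluster-then-select pipeline the abstract describes, assuming a bagging ensemble of decision trees, an agreement-based affinity between base learners, and best-validation-accuracy selection within each cluster. These choices, along with the target size k, are illustrative assumptions: the paper's actual method optimizes the clustering and the structural-sparsity-based selection jointly, which this two-step approximation does not.

```python
# Sketch: static ensemble selection via spectral clustering of base learners.
# Assumed design (not the paper's joint formulation): agreement affinity,
# per-cluster pick by validation accuracy, majority vote at prediction time.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Original pool: a bagging ensemble of 50 decision trees.
ensemble = BaggingClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
P = np.array([est.predict(X_val) for est in ensemble.estimators_])

# Affinity between two learners = fraction of validation instances
# on which their predictions agree (symmetric, non-negative).
S = (P[:, None, :] == P[None, :, :]).mean(axis=2)

k = 10  # target sub-ensemble size (hyperparameter)
labels = SpectralClustering(n_clusters=k, affinity="precomputed",
                            random_state=0).fit_predict(S)

# Keep the most accurate learner from each cluster: one fixed (static)
# subset, reused unchanged for every future test instance.
acc = (P == y_val).mean(axis=1)
selected = [np.flatnonzero(labels == c)[np.argmax(acc[labels == c])]
            for c in range(k)]

# Majority vote over the pruned sub-ensemble.
votes = P[selected]
pruned_pred = np.apply_along_axis(
    lambda col: np.bincount(col).argmax(), 0, votes)
print("pruned validation accuracy:", (pruned_pred == y_val).mean())
```

Because the selected indices are fixed once, only the k chosen base learners need to be stored and evaluated at test time, which is the space advantage of the static setting the abstract contrasts with dynamic selection.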
