Feature Selection With Missing Labels Using Multilabel Fuzzy Neighborhood Rough Sets and Maximum Relevance Minimum Redundancy

Multilabel classification has recently attracted considerable research interest. However, the high dimensionality of multilabel data incurs high costs; moreover, in many real applications, some labels of the training samples are randomly missing. As a result, multilabel classification can be highly complex and ambiguous, and some feature selection methods exhibit poor robustness and yield low prediction accuracy. To address these issues, this paper presents a novel feature selection method based on multilabel fuzzy neighborhood rough sets (MFNRS) and maximum relevance minimum redundancy (MRMR) that can be applied to multilabel data with missing labels. First, to handle multilabel data with missing labels, a relation coefficient of samples, a label complement matrix, and a label-specific feature matrix are constructed and used in a linear regression model to recover the missing labels. Second, a margin-based fuzzy neighborhood radius, a fuzzy neighborhood similarity relationship, and a fuzzy neighborhood information granule are developed, and the MFNRS model is built by combining multilabel neighborhood rough sets with fuzzy neighborhood rough sets. Third, from the algebra and information views, fuzzy neighborhood entropy-based uncertainty measures are proposed for MFNRS, and the fuzzy neighborhood mutual information-based MRMR model is improved with label correlation to evaluate candidate features. Finally, a feature selection algorithm is designed to improve classification performance on multilabel data with missing labels. Experiments on twenty datasets verify that the method is effective not only for recovering missing labels but also for selecting significant features that yield better classification performance.
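To make the MRMR criterion named above concrete, the following is a minimal sketch of the classical greedy scheme on discrete data, not the method of this paper. It assumes plain empirical mutual information in place of fuzzy neighborhood mutual information, averages relevance over the labels without any label-correlation weighting, and assumes labels are already complete. The names `mutual_information` and `mrmr_select` and the toy data are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick k feature names by relevance minus redundancy.

    features: dict mapping feature name -> list of discrete values
    labels:   list of label columns (one list of discrete values per label)
    Assumption: relevance is the mean MI with each label, and redundancy is
    the mean MI with the features already selected.
    """
    selected = []
    candidates = list(features)

    def score(name):
        relevance = sum(
            mutual_information(features[name], lab) for lab in labels
        ) / len(labels)
        if not selected:
            return relevance
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)  # ties go to the earliest candidate
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: f2 duplicates f1, so it is fully redundant once f1 is chosen.
toy_features = {
    "f1": [0, 0, 1, 1, 0, 0, 1, 1],
    "f2": [0, 0, 1, 1, 0, 0, 1, 1],
    "f3": [0, 1, 0, 1, 0, 1, 0, 1],
}
toy_labels = [
    [0, 0, 1, 1, 0, 0, 1, 1],  # label determined by f1
    [0, 1, 0, 1, 0, 1, 0, 1],  # label determined by f3
]
print(mrmr_select(toy_features, toy_labels, k=2))  # ['f1', 'f3']
```

In this toy example the duplicated feature is skipped because its redundancy with the already selected feature cancels its relevance, which is the behavior the MRMR criterion is meant to produce.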
