Multi-label classification using a fuzzy rough neighborhood consensus

Abstract. A multi-label dataset consists of observations that are each associated with one or more outcomes. In this setting, the traditional classification task generalizes to the simultaneous prediction of several class labels. In this paper, we propose a new nearest-neighbor-based multi-label method. The nearest neighbor approach remains an intuitive and effective way to solve classification problems, and popular multi-label classifiers adhering to this paradigm include the MLKNN and IBLR methods. To classify an instance, our proposal derives a consensus among the labelsets of its nearest neighbors based on fuzzy rough set theory. This mathematical framework captures data uncertainty and offers a way to extract a labelset from the dataset that summarizes the information contained in the labelsets of the neighbors. In our experimental study, we compare the performance of our method with five other nearest-neighbor-based multi-label classifiers on five evaluation metrics commonly used in multi-label classification. Based on the results on both synthetic and real-world datasets, we conclude that our method is a strong competitor to nearest-neighbor-based multi-label classifiers such as MLKNN and IBLR.
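To make the nearest neighbor paradigm mentioned in the abstract concrete, the following is a minimal sketch of a plain per-label vote among the k nearest neighbors. It is not the fuzzy rough consensus proposed in the paper, nor MLKNN or IBLR; the function name `knn_labelset_vote`, the Euclidean distance, the 0.5 threshold, and the toy data are all assumptions chosen for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def knn_labelset_vote(train_X, train_Y, query, k=3, threshold=0.5):
    """Predict a binary labelset by per-label voting among the k nearest neighbors.

    Illustrative baseline only (an assumption, not the paper's method).
    """
    # Rank training instances by distance to the query and keep the k closest.
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    neighbors = order[:k]
    n_labels = len(train_Y[0])
    # Fraction of neighbors carrying each label.
    support = [
        sum(train_Y[i][j] for i in neighbors) / len(neighbors)
        for j in range(n_labels)
    ]
    # A label is predicted when more than `threshold` of the neighbors carry it.
    return [1 if s > threshold else 0 for s in support]


# Hypothetical toy data: five instances, three labels.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
train_Y = [[1, 0, 1], [1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 1, 1]]

pred_a = knn_labelset_vote(train_X, train_Y, (0.2, 0.1), k=3)
pred_b = knn_labelset_vote(train_X, train_Y, (5.5, 5.0), k=2)
print(pred_a, pred_b)
```

A vote of this kind treats each label independently. The method described in the abstract instead derives a single labelset that summarizes the neighbors' labelsets as a whole.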
