Fuzzy rough sets for self-labelling: An exploratory analysis

Semi-supervised learning combines aspects of supervised and unsupervised learning. In semi-supervised classification, only some data instances have associated class labels, while the others are unlabelled. One group of semi-supervised classification approaches is that of self-labelling techniques, which assign class labels to the unlabelled instances using class predictions derived from the labelled part of the data. In this paper, we investigate the applicability and suitability of fuzzy rough set theory for the task of self-labelling. We present a preparatory experimental study that evaluates how accurately different fuzzy rough set models predict the classes of unlabelled instances in semi-supervised classification. The predictions are made either from the labelled instances alone or by involving the unlabelled instances as well. A stability analysis of the predictions provides further insight into the characteristics of the different fuzzy rough set models. Our study shows that the ordered weighted average (OWA) based fuzzy rough set model performs best in terms of both accuracy and stability. These conclusions offer a solid foundation and rationale for constructing a fuzzy rough self-labelling technique, and they clarify the applicability of fuzzy rough sets to semi-supervised classification in general.
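To make the idea of predicting a class with an OWA based fuzzy rough model more concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes crisp class labels, a similarity relation equal to the mean per-feature similarity, the Łukasiewicz implicator, the minimum t-norm, additive (linearly increasing) OWA weights, and a prediction score equal to the average of the soft lower and upper approximation memberships. All function names are hypothetical and chosen for illustration only.

```python
def owa(values, weights):
    """Ordered weighted average: weights are applied to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def soft_min_weights(p):
    """Assumed additive weights: the largest weight falls on the smallest value."""
    total = p * (p + 1) / 2
    return [i / total for i in range(1, p + 1)]


def soft_max_weights(p):
    """Mirror image of the soft minimum weights: the largest weight falls on the largest value."""
    return soft_min_weights(p)[::-1]


def similarity(x, y, ranges):
    """Assumed fuzzy relation: mean over features of 1 - |x_i - y_i| / range_i (ranges must be > 0)."""
    per_feature = [1 - abs(a - b) / r for a, b, r in zip(x, y, ranges)]
    return sum(per_feature) / len(per_feature)


def predict(x, labelled, ranges):
    """Score each class by averaging OWA lower and upper approximation memberships of x."""
    classes = sorted({label for _, label in labelled})
    scores = {}
    for c in classes:
        sims_in = [similarity(x, y, ranges) for y, label in labelled if label == c]
        sims_out = [similarity(x, y, ranges) for y, label in labelled if label != c]
        # Lower approximation with crisp classes and the Lukasiewicz implicator:
        # only instances outside class c contribute, each with value 1 - R(x, y).
        if sims_out:
            lower = owa([1 - s for s in sims_out], soft_min_weights(len(sims_out)))
        else:
            lower = 1.0
        # Upper approximation with the minimum t-norm: only instances inside class c contribute.
        upper = owa(sims_in, soft_max_weights(len(sims_in)))
        scores[c] = (lower + upper) / 2
    best = max(scores, key=scores.get)
    return best, scores


# Toy labelled data (made up for illustration): two features scaled to [0, 1].
labelled = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 0.8), "b"),
]
label, scores = predict((0.05, 0.1), labelled, ranges=(1.0, 1.0))
print(label, scores)
```

With weights (0, ..., 0, 1) the soft minimum reduces to the classical minimum, so the traditional lower approximation is a special case of this sketch. In a self-labelling loop, the instances whose predictions are most confident would typically be moved to the labelled set, but how that selection is done is left open here.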
