A first study on the noise impact in classes for Fuzzy Rule Based Classification Systems

The presence of noise is common in any real data set and may adversely affect the accuracy, construction time and complexity of classifiers. Models built by Fuzzy Rule Based Classification Systems (FRBCSs) are recognised for their interpretability, but these methods have traditionally not accounted for noise in the data, so it is worth quantifying its effect on them. The aim of this contribution is to study the behavior and robustness of FRBCSs in the presence of noise. To this end, 69 synthetic data sets have been created from 23 data sets from the UCI repository by artificially introducing different levels of noise into the class labels. The methods of Chi et al. and PDFC are considered as a case study, analyzing the accuracy of the models they create. From the results obtained, it can be concluded that FRBCSs have a good tolerance to class noise.
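To make the idea of artificially introduced class noise concrete, the following is a minimal sketch in Python. It assumes one common scheme, in which a chosen fraction of examples has its label replaced by a different class picked uniformly at random. The abstract does not state the exact noise scheme or the noise levels used in the study, so the function name `inject_class_noise`, its parameters, and the 10% level in the usage lines are illustrative assumptions only.

```python
import random


def inject_class_noise(labels, noise_level, seed=0):
    """Return a copy of `labels` with roughly `noise_level` of them corrupted.

    Assumption (not taken from the paper): each selected example receives a
    different class label chosen uniformly at random from the other classes.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must lie in [0, 1]")

    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("at least two classes are needed to flip a label")

    rng = random.Random(seed)  # fixed seed so the corruption is reproducible
    noisy = list(labels)
    n_corrupt = round(noise_level * len(labels))

    # Pick distinct positions to corrupt, then swap in a different class.
    for i in rng.sample(range(len(labels)), n_corrupt):
        alternatives = [c for c in classes if c != labels[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


# Hypothetical usage: a toy label vector with three classes, 10% class noise.
clean = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
noisy = inject_class_noise(clean, noise_level=0.10)
changed = sum(1 for x, y in zip(clean, noisy) if x != y)
```

Applying such a function at several noise levels to each original data set yields a family of noisy variants on which classifiers can be trained and compared with their performance on the clean data.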
