Fuzzy Case-Based System for Classification Tasks on Missing and Noisy Data

Hybrid models combine different technologies to obtain a product that shares their advantages and minimizes their deficiencies. The solutions given by a case-based system (CBS) rely on similar past experiences, which are commonly described in terms of both symbolic and continuous attributes. The nearest neighbor (NN) principle, commonly followed when developing a CBS for classification tasks, proceeds from the assumption that similar cases have similar solutions, so the definition of the distance (similarity) function is central to obtaining good accuracy on a given data set. This paper presents a hybrid model that solves classification tasks using the NN principle but includes generalized knowledge from the set of given instances to improve performance over pure lazy learning algorithms. The fuzzy case-based system, referred to as FuCiuS, interprets predictive numeric attributes in terms of fuzzy sets, defining in a conceptually uniform way a one-dimensional (local) criterion for comparing mixed and missing data. Experimental analysis shows good performance for FuCiuS in comparison with well-known classifiers on missing and noisy data, while the use of linguistic terms provides a more natural framework for including expert knowledge, guaranteeing both robustness and interpretable solutions.
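
To make the idea of a one-dimensional (local) comparison criterion more concrete, the following is a minimal sketch of a nearest-neighbor classifier over mixed and missing data. It is not the FuCiuS algorithm, whose details the abstract does not give. Everything specific here is an assumption made for illustration: numeric attributes are taken to be normalised to [0, 1] and described by three triangular fuzzy sets (low, medium, high), the local similarity of two numeric values is the summed minimum of their membership degrees, symbolic values are compared by exact match, a missing value (`None`) receives a neutral score of 0.5, and the global similarity is an unweighted mean. All function names are hypothetical.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def memberships(x):
    """Degrees of a normalised value x in [0, 1] in the assumed
    linguistic terms (low, medium, high); the three degrees sum to 1."""
    return (
        max(0.0, 1.0 - 2.0 * x),             # low, peak at 0
        max(0.0, 1.0 - 2.0 * abs(x - 0.5)),  # medium, peak at 0.5
        max(0.0, 2.0 * x - 1.0),             # high, peak at 1
    )


def local_similarity(a, b):
    """One-dimensional similarity in [0, 1] for a single attribute."""
    if a is MISSING or b is MISSING:
        return 0.5  # assumption: unknown values are neither similar nor dissimilar
    if isinstance(a, str) or isinstance(b, str):
        return 1.0 if a == b else 0.0  # symbolic attribute: exact match
    # Numeric attribute: overlap of the two fuzzy descriptions.
    return sum(min(p, q) for p, q in zip(memberships(a), memberships(b)))


def similarity(case, query):
    """Global similarity: unweighted mean of the local similarities."""
    scores = [local_similarity(a, b) for a, b in zip(case, query)]
    return sum(scores) / len(scores)


def classify(query, cases, labels, k=3):
    """Majority vote among the k stored cases most similar to the query."""
    ranked = sorted(
        range(len(cases)),
        key=lambda i: similarity(cases[i], query),
        reverse=True,
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy case base: one numeric and one symbolic attribute, with missing values.
cases = [(0.1, "red"), (0.2, "red"), (0.9, "blue"), (0.8, MISSING)]
labels = ["a", "a", "b", "b"]
predicted = classify((0.15, MISSING), cases, labels, k=3)
```

Because every attribute type, including a missing value, is mapped to a score in [0, 1], the same aggregation step serves mixed data without a separate imputation pass. A real system would typically learn the fuzzy sets and attribute weights from the data rather than fix them as done here.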
