An Unsupervised Learning Approach for Case-Based Classifier Systems

Case-Based Classifier Systems generalise with low accuracy and consume more CPU time when the class distribution space is not well defined. This paper presents two approaches based on unsupervised learning, Mean Sphere and Mean K-Means, that reduce CPU time while maintaining or improving accuracy. We use clustering in an unsupervised way to determine the representational space of each class. Clustering is introduced at two levels. The first level clusters the training data into spheres, obtaining one sphere for each class. The second level clusters the spheres in order to detect the behaviour of the elements present in each sphere. Two policies are applied at this level: Mean Sphere and Mean K-Means. Experiments on different domains, most of them from the UCI repository, show that CPU time is considerably reduced while the accuracy of the system is maintained and sometimes improved.
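
The first clustering level can be illustrated with a minimal sketch. The abstract does not define how a sphere is built or how it is used at retrieval time, so everything here is an assumption: each sphere is taken to be the class centroid plus the largest distance from that centroid to a case of the class, and a query is assigned to the class with the nearest sphere centre. The function names `centroid`, `build_spheres`, and `classify` are hypothetical, and the second-level policies (Mean Sphere and Mean K-Means) are not modelled because the abstract does not describe them.

```python
import math
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def build_spheres(cases, labels):
    """Assumed first level: one sphere per class, stored as (centre, radius).

    The radius is the largest Euclidean distance from the class centroid
    to any training case of that class (an assumption, not the paper's
    stated definition).
    """
    by_class = defaultdict(list)
    for case, label in zip(cases, labels):
        by_class[label].append(case)

    spheres = {}
    for label, points in by_class.items():
        centre = centroid(points)
        radius = max(math.dist(centre, p) for p in points)
        spheres[label] = (centre, radius)
    return spheres


def classify(spheres, query):
    """Assumed retrieval rule: pick the class with the nearest sphere centre."""
    return min(spheres, key=lambda label: math.dist(spheres[label][0], query))


# Toy data: two well-separated classes in two dimensions.
cases = [[0, 0], [0, 2], [10, 10], [10, 12]]
labels = ["a", "a", "b", "b"]
spheres = build_spheres(cases, labels)
prediction = classify(spheres, [1, 1])
```

Under these assumptions a query is compared against one centre per class rather than against every stored case, which is the kind of saving in retrieval cost the abstract attributes to clustering.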
