Paper Information - An Algorithm for Automatic Generation of a Case Base from a Database Using Similarity-Based Rough Approximation

Acquiring knowledge from domain experts is a bottleneck in the development of case-based reasoning systems. In recent years, large amounts of data have become available in many areas, so deriving representative cases from existing databases rather than from domain experts is both feasible and promising. This paper presents an algorithm that derives cases automatically from available databases. The algorithm is based on similarity-based rough set theory. It can handle inconsistent data and select a reasonable number of representative cases from a database. The algorithm was implemented in Java, and the experimental results indicate that, under some conditions, the classification accuracy of the derived case base can be superior to that of some well-known data mining systems, such as rule induction systems and neural network systems.
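To make the idea of similarity-based rough approximation concrete, the following is a minimal sketch and not the authors' algorithm or their Java implementation. It assumes numeric attributes scaled to [0, 1], a hypothetical `similarity` function (one minus the mean absolute difference), and a hypothetical similarity threshold of 0.9. A record belongs to the lower approximation of a class when every record similar to it carries that class label, and to the upper approximation when at least one similar record does.

```python
from typing import List, Sequence, Set, Tuple


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Hypothetical measure: 1 minus the mean absolute attribute difference.
    # Assumes every attribute is already scaled to the range [0, 1].
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def similarity_class(i: int, records: List[Sequence[float]], threshold: float) -> Set[int]:
    # Indices of all records whose similarity to record i reaches the threshold.
    return {j for j, r in enumerate(records) if similarity(records[i], r) >= threshold}


def rough_approximations(
    records: List[Sequence[float]],
    labels: List[str],
    target: str,
    threshold: float,
) -> Tuple[Set[int], Set[int]]:
    # Records that actually carry the target class label.
    members = {i for i, lab in enumerate(labels) if lab == target}
    lower: Set[int] = set()
    upper: Set[int] = set()
    for i in range(len(records)):
        neighbours = similarity_class(i, records, threshold)
        if neighbours <= members:   # every similar record is in the class
            lower.add(i)
        if neighbours & members:    # at least one similar record is in the class
            upper.add(i)
    return lower, upper


# Illustrative data (made up for this sketch): record 3 resembles records 0 and 1
# but carries a different label, so it is an inconsistent record.
records = [(0.1, 0.2), (0.15, 0.2), (0.9, 0.8), (0.12, 0.22)]
labels = ["a", "a", "b", "b"]

lower_a, upper_a = rough_approximations(records, labels, "a", 0.9)
lower_b, upper_b = rough_approximations(records, labels, "b", 0.9)
```

In this toy data, the inconsistent record keeps class "a" out of any lower approximation, while record 2 is the only record that certainly belongs to class "b". A case-selection procedure could treat lower-approximation records as reliable candidates, though how the paper actually chooses representative cases is not stated in the abstract.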

Christine W. Chan | Liqiang Geng
