Data classification based on tolerant rough set

Abstract This paper proposes a new data classification method based on the tolerant rough set, which extends the existing rough set built on equivalence relations. The similarity between two objects is measured by a distance function over all constituent attributes, and two objects are defined to be tolerant when their similarity measure exceeds a similarity threshold. Because accurate classification depends strongly on this threshold, we determine it with a genetic algorithm (GA) whose goal is to balance two requirements: (1) tolerant objects should be included in the same class as much as possible, and (2) objects in the same class should be tolerant of one another as much as possible. Once the optimal similarity threshold is found, the tolerant set of each object is obtained, and the data set is grouped into lower and upper approximation sets depending on whether the classes of the objects in each tolerant set coincide. We propose a two-stage classification method: in the first stage, all data are classified by using the lower approximation; in the second stage, the data left unclassified by the first stage are classified by using rough membership functions obtained from the upper approximation set. The validity of the proposed method is tested on the IRIS data classification problem, and its classification performance and processing time are compared with those of other classification methods such as BPNN, OFUNN, and FCM.
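The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the similarity function (one minus the mean range-normalized absolute difference), the exact stage-1 rule, the toy data, and all names such as `tolerant_set` and `classify` are assumptions made for illustration. The threshold is fixed by hand here, whereas the paper tunes it with a GA.

```python
from collections import Counter

def similarity(x, y, ranges):
    """Assumed similarity in (-inf, 1]: 1 minus the mean range-normalized distance."""
    diffs = [abs(a - b) / r for a, b, r in zip(x, y, ranges)]
    return 1.0 - sum(diffs) / len(diffs)

def tolerant_set(x, data, ranges, threshold):
    """Indices of training objects whose similarity to x exceeds the threshold."""
    return [j for j, y in enumerate(data) if similarity(x, y, ranges) > threshold]

def lower_approximation(data, labels, ranges, threshold):
    """Indices of objects whose whole tolerant set carries their own class."""
    lower = set()
    for i, x in enumerate(data):
        neighbours = tolerant_set(x, data, ranges, threshold)
        if all(labels[j] == labels[i] for j in neighbours):
            lower.add(i)
    return lower

def rough_membership(neighbours, labels):
    """Fraction of the tolerant set that belongs to each class."""
    counts = Counter(labels[j] for j in neighbours)
    return {c: n / len(neighbours) for c, n in counts.items()}

def classify(x, data, labels, ranges, threshold):
    """Return (label, stage); stage 0 means no tolerant training object was found."""
    neighbours = tolerant_set(x, data, ranges, threshold)
    if not neighbours:
        return None, 0
    lower = lower_approximation(data, labels, ranges, threshold)
    classes = {labels[j] for j in neighbours}
    # Stage 1 (assumed rule): every tolerant neighbour is in the lower
    # approximation and they all agree on one class.
    if len(classes) == 1 and all(j in lower for j in neighbours):
        return classes.pop(), 1
    # Stage 2: fall back to the class with the largest rough membership.
    membership = rough_membership(neighbours, labels)
    return max(membership, key=membership.get), 2

# Hypothetical toy data: two clean clusters and one mixed (boundary) cluster.
DATA = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.2, 5.1),
        (3.0, 3.0), (3.2, 3.1), (2.9, 3.0)]
LABELS = ["a", "a", "b", "b", "a", "b", "a"]
RANGES = tuple(max(col) - min(col) for col in zip(*DATA))
THRESHOLD = 0.9  # fixed by hand; the paper selects it with a GA

if __name__ == "__main__":
    print(sorted(lower_approximation(DATA, LABELS, RANGES, THRESHOLD)))
    print(classify((1.1, 1.0), DATA, LABELS, RANGES, THRESHOLD))
    print(classify((3.1, 3.0), DATA, LABELS, RANGES, THRESHOLD))
```

In this toy setting, a query near a clean cluster is settled in the first stage, while a query near the mixed cluster falls through to the rough membership vote. Recomputing the lower approximation on every call keeps the sketch short; a practical version would cache it.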
