Paper information - Decoupling of clustering and classification steps in a cluster-based classification

The application of cluster analysis to classification is well known. Such an application takes place in two steps: "clustering" and "classification". In the clustering step, the objects of a training set are clustered using a clustering technique, Q. The outcome is a set of clusters, C. Each cluster, ci, is assigned a class label, ki, which reflects the common features of the objects in ci; each ki is a member of the set K. In the classification step, a new object from a test set is assigned to one of the clusters in C using the Q, C, and K of the former step. The goal of this research effort is twofold: (1) introducing a methodology for decoupling the "clustering" and "classification" steps and (2) establishing the validity of the proposed methodology by comparing its classification performance with that of the rough sets approach and discriminant analysis.
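The conventional two-step scheme described in the abstract can be sketched as follows. This is only an illustrative sketch of the coupled baseline, not the decoupling methodology the paper proposes (the abstract does not specify it). It assumes a plain k-means as the clustering technique Q, Euclidean distance, and a majority vote to pick each cluster's label ki; the function names and the toy data are hypothetical.

```python
from collections import Counter
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def cluster(points, k, iterations=20):
    """Q (assumed): plain k-means, seeded with the first k points for determinism."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def label_clusters(points, labels, centroids):
    """K (assumed rule): label each cluster c_i with the majority class k_i of its members."""
    votes = [Counter() for _ in centroids]
    for p, y in zip(points, labels):
        votes[nearest(p, centroids)][y] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def classify(point, centroids, cluster_labels):
    """Classification step: assign the new object to a cluster and return that cluster's label."""
    return cluster_labels[nearest(point, centroids)]


# Toy training set (hypothetical data, two well-separated groups).
train_x = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.3), (4.8, 5.0)]
train_y = ["neg", "pos", "neg", "pos", "neg", "pos"]

C = cluster(train_x, k=2)                # clustering step: the set of clusters C
K = label_clusters(train_x, train_y, C)  # one class label k_i per cluster c_i

print(classify((0.3, 0.2), C, K))
print(classify((4.9, 5.2), C, K))
```

Note that `classify` reuses the same distance rule as `cluster`. That dependence of the classification step on Q, C, and K is the coupling the paper sets out to remove.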

Ray R. Hashemi | Mahmood Bahar | Alexander A. Tyler | Christopher Childers
