Hybrid Classification Ensemble Using Topology-preserving Clustering

This study presents a novel hybrid intelligent system that combines unsupervised and supervised learning and can be easily adapted for use in an individual or collaborative system. The system splits the classification problem into two stages. First, it divides the input data space into several parts according to the distribution of the data set in that space. Then, it generates several simple classifiers, each of which is used to classify the samples contained in one of the previously determined parts. In this way, the efficiency of each classifier increases, because it can specialize in classifying only related samples from a certain region of the input data space. This specialization enables the individual classifiers to learn more specific patterns or characteristics of the data space, avoiding the risk of obtaining a general algorithm that over-fits the data. The hybrid system has been tested with artificial and real data sets. A comparative study of the results obtained by the novel model and those obtained by other common classification methods is also included.
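The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means stands in for the topology-preserving clustering named in the title, a nearest-class-mean rule stands in for the "simple classifiers", and every function name and the toy data set are assumptions made for illustration only.

```python
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def nearest(point, centres):
    """Index of the centre closest to `point`."""
    return min(range(len(centres)), key=lambda i: sq_dist(point, centres[i]))

def init_centres(X, k):
    """Deterministic farthest-point initialisation (an assumption of this sketch)."""
    centres = [X[0]]
    while len(centres) < k:
        centres.append(max(X, key=lambda x: min(sq_dist(x, c) for c in centres)))
    return centres

def kmeans(X, k, iters=20):
    """Stage 1 stand-in: partition the input space with plain k-means."""
    centres = init_centres(X, k)
    for _ in range(iters):
        groups = defaultdict(list)
        for x in X:
            groups[nearest(x, centres)].append(x)
        # Keep the old centre if a region received no samples.
        centres = [mean(groups[i]) if i in groups else centres[i]
                   for i in range(len(centres))]
    return centres

def fit(X, y, k):
    """Stage 2: train one nearest-class-mean classifier per region."""
    centres = kmeans(X, k)
    by_region = defaultdict(lambda: defaultdict(list))
    for x, label in zip(X, y):
        by_region[nearest(x, centres)][label].append(x)
    experts = {region: {label: mean(pts) for label, pts in classes.items()}
               for region, classes in by_region.items()}
    return centres, experts

def predict(model, x):
    """Route `x` to the closest region that has a classifier, then classify locally."""
    centres, experts = model
    region = min(experts, key=lambda r: sq_dist(x, centres[r]))
    class_means = experts[region]
    return min(class_means, key=lambda label: sq_dist(x, class_means[label]))

# Toy XOR-like data: the class means coincide globally, so a single
# nearest-class-mean classifier cannot separate the classes, while the
# per-region classifiers can.
X_train = [(0.0, 0.0), (0.0, 1.0), (0.2, 0.1), (0.1, 0.9),
           (10.0, 0.0), (10.0, 1.0), (10.2, 0.1), (10.1, 0.9)]
y_train = ["a", "b", "a", "b", "b", "a", "b", "a"]
model = fit(X_train, y_train, k=2)
```

In this toy set the label depends on the second coordinate in opposite ways in the two regions, which is the kind of situation where specialising each classifier on one part of the input space can help.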
