Ensemble classifier generation using non-uniform layered clustering and Genetic Algorithm

In this paper, we propose a novel cluster-oriented method for generating ensemble classifiers, together with a Genetic Algorithm based approach for optimizing its parameters. In the proposed method, the data set is partitioned into a variable number of clusters at different layers, and base classifiers are trained on the clusters at each layer. Because the number of clusters varies from layer to layer, the cluster compositions in one layer differ from those in another. This difference in cluster contents makes the base classifiers trained at different layers diverse. A test pattern is classified by the base classifier of the nearest cluster at each layer, and the decisions from the different layers are fused using majority voting. The accuracy of the proposed method depends on the number of layers and the number of clusters at each layer. A Genetic Algorithm based search is incorporated to find the optimal number of layers and clusters. The Genetic Algorithm is evaluated under three different objective functions: optimizing (i) accuracy, (ii) diversity, and (iii) accuracy × diversity. We have conducted a number of experiments to evaluate the effectiveness of the different objective functions.
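The layered scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`kmeans`, `train_layers`, `predict`), the deterministic centroid initialisation, and the toy data are all assumptions made for illustration. As a stand-in for a trained base classifier, each cluster simply predicts the majority label of its training patterns. The list `clusters_per_layer` plays the role of the parameters that the Genetic Algorithm would search over.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, p):
    """Index of the centroid closest to p (ties go to the lowest index)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], p))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with evenly spaced, deterministic initial centroids."""
    centroids = [points[(i * len(points)) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def train_layers(X, y, clusters_per_layer):
    """Build one layer per entry of clusters_per_layer (non-uniform cluster counts)."""
    overall = Counter(y).most_common(1)[0][0]
    layers = []
    for k in clusters_per_layer:
        centroids = kmeans(X, k)
        votes = [Counter() for _ in range(k)]
        for p, label in zip(X, y):
            votes[nearest(centroids, p)][label] += 1
        # Stand-in base classifier (assumption): majority label within the cluster.
        labels = [v.most_common(1)[0][0] if v else overall for v in votes]
        layers.append((centroids, labels))
    return layers

def predict(layers, p):
    """Classify p with the nearest cluster's classifier in each layer, then majority-vote."""
    decisions = [labels[nearest(centroids, p)] for centroids, labels in layers]
    return Counter(decisions).most_common(1)[0][0]

# Toy data (hypothetical): two well-separated groups of patterns.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
layers = train_layers(X, y, clusters_per_layer=[2, 3, 4])
print(predict(layers, (0.5, 0.5)), predict(layers, (10.2, 10.7)))
```

In a search over the number of layers and clusters, a candidate solution could be encoded as a list such as `[2, 3, 4]` and scored by the accuracy of `predict` on held-out patterns. The diversity-based objectives would need a diversity measure, which is not specified here.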
