GEP-Induced Expression Trees as Weak Classifiers

This paper proposes applying Gene Expression Programming (GEP) to induce expression trees that are subsequently used as weak classifiers. Two techniques for constructing ensemble classifiers from weak classifiers are investigated. The working hypothesis can be stated as follows: given a set of classifiers generated by gene expression programming and combined using some variant of boosting, one can construct an ensemble effectively producing high-quality classification results. A detailed description of the proposed GEP implementation, which generates classifiers in the form of expression trees, is followed by a report on the AdaBoost and boosting algorithms used to construct an ensemble classifier. To validate the approach, a computational experiment involving several benchmark datasets has been carried out. The results show that using GEP-induced expression trees as weak classifiers allows for the construction of a high-quality ensemble classifier that outperforms, in terms of classification accuracy, many other recently published solutions.
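
The idea of boosting expression trees can be illustrated with a minimal sketch. It is not the paper's implementation: the evolutionary GEP search is replaced here by a small hand-written pool of hypothetical expression trees, and the tuple encoding, the sign-based decision rule, the toy data, and all function names (`evaluate`, `classify`, `adaboost`, `predict`) are assumptions made for illustration. The boosting loop follows the standard two-class AdaBoost weight update.

```python
import math

def evaluate(tree, x):
    """Evaluate an expression tree on the attribute vector x.
    Assumed encoding: (op, left, right) tuples, "x<i>" attribute names, or numeric constants."""
    if isinstance(tree, tuple):
        op, left, right = tree
        a, b = evaluate(left, x), evaluate(right, x)
        if op == "+":
            return a + b
        if op == "-":
            return a - b
        if op == "*":
            return a * b
        raise ValueError("unknown operator: " + op)
    if isinstance(tree, str):
        return x[int(tree[1:])]   # "x0" -> x[0]
    return tree                   # numeric constant

def classify(tree, x):
    """Weak classifier (assumed rule): the sign of the tree's value gives the class, +1 or -1."""
    return 1 if evaluate(tree, x) > 0 else -1

def adaboost(pool, X, y, rounds):
    """Standard two-class AdaBoost over a fixed pool of trees.
    In a GEP setting each round would evolve a new tree; here the best pool member is picked."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best, best_err = None, None
        for tree in pool:
            err = sum(wi for wi, xi, yi in zip(w, X, y) if classify(tree, xi) != yi)
            if best_err is None or err < best_err:
                best, best_err = tree, err
        if best_err >= 0.5:       # no tree is better than chance on the current weights
            break
        best_err = max(best_err, 1e-10)   # guard against log of infinity for a perfect tree
        alpha = 0.5 * math.log((1.0 - best_err) / best_err)
        ensemble.append((alpha, best))
        # Raise the weight of misclassified examples, lower the rest, then renormalise.
        w = [wi * math.exp(-alpha * yi * classify(best, xi)) for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted majority vote of the selected trees."""
    score = sum(alpha * classify(tree, x) for alpha, tree in ensemble)
    return 1 if score > 0 else -1

# Toy one-attribute data (illustrative only): no single tree in the pool separates it.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
pool = [
    ("-", 2.5, "x0"),   # positive when x0 < 2.5
    ("-", "x0", 4.5),   # positive when x0 > 4.5
    1.0,                # constant tree, always positive
]
ensemble = adaboost(pool, X, y, rounds=3)
predictions = [predict(ensemble, xi) for xi in X]
```

On this toy data each tree alone misclassifies two of the six examples, while the weighted vote of the three selected trees labels all six correctly, which is the effect the working hypothesis relies on.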
