A hybrid multiple feature construction approach for classification using Genetic Programming

Abstract The purpose of feature construction is to create new, higher-level features from the original ones. Genetic Programming (GP) is commonly employed for feature construction because of its flexible representation. Feature construction approaches are usually divided into filter-based and wrapper-based approaches according to the evaluation function they use. In this paper, we propose a hybrid feature construction approach using GP (Hybrid-GPFC) that combines the fitness function of a filter with that of a wrapper, and a multiple feature construction method that stores the top-ranked individuals during a single GP run. Experiments on ten datasets show that the proposed multiple feature construction method (Fcm) achieves better or equivalent classification performance compared with the single feature construction method (Fcs), and that Hybrid-GPFC obtains better classification performance than filter-based (Filter-GPFC) and wrapper-based (Wrapper-GPFC) feature construction approaches in most cases. Further investigation of combinations of constructed and original features shows that augmenting the constructed features with the original features does not improve classification performance compared with using the constructed features alone. Comparisons with three state-of-the-art methods show that, in the majority of cases, the proposed hybrid multiple feature construction approach achieves better classification performance.
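The two ideas in the abstract, a fitness that mixes a filter score with a wrapper score and the retention of several top-ranked individuals from one run, can be illustrated with a small sketch. The abstract does not specify the actual fitness functions, so everything below is an assumption made for illustration: the Fisher-style `filter_score`, the leave-one-out 1-NN `wrapper_score`, the weight `alpha`, and the use of plain Python callables in place of evolved GP trees are all hypothetical.

```python
import heapq
from statistics import mean, pvariance


def filter_score(values, labels):
    """Assumed filter measure: Fisher-style separation of one feature for
    two classes (0 and 1), rescaled into [0, 1]."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    gap = (mean(a) - mean(b)) ** 2          # between-class distance
    spread = pvariance(a) + pvariance(b)    # within-class spread
    return gap / (gap + spread) if gap + spread > 0 else 0.0


def wrapper_score(values, labels):
    """Assumed wrapper measure: leave-one-out accuracy of a 1-NN
    classifier that sees only the constructed feature."""
    n = len(values)
    correct = 0
    for i in range(n):
        nearest = min((k for k in range(n) if k != i),
                      key=lambda k: abs(values[k] - values[i]))
        correct += labels[nearest] == labels[i]
    return correct / n


def hybrid_fitness(feature, rows, labels, alpha=0.5):
    """Weighted mix of the two scores; alpha is a hypothetical parameter."""
    values = [feature(r) for r in rows]
    return (alpha * filter_score(values, labels)
            + (1 - alpha) * wrapper_score(values, labels))


# Toy data: two original features per row, binary class labels.
rows = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5),
        (4.0, 5.0), (5.0, 4.0), (4.5, 4.5)]
labels = [0, 0, 0, 1, 1, 1]

# Stand-ins for GP individuals: named expressions over the original features.
candidates = {
    "x0 + x1": lambda r: r[0] + r[1],
    "x0 - x1": lambda r: r[0] - r[1],
    "x0 * x1": lambda r: r[0] * r[1],
}

scored = [(hybrid_fitness(f, rows, labels), name)
          for name, f in candidates.items()]

# Multiple feature construction: keep the k best individuals, not just one.
top = [name for _, name in heapq.nlargest(2, scored)]
print(top)
```

In a real GP run the candidates would come from the evolving population rather than a fixed dictionary, and the retained individuals would each contribute one constructed feature to the final classifier.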
