Evolutionary computation for feature manipulation: Key challenges and future directions

In machine learning and data mining, feature manipulation is a data pre-processing step that improves the quality of a feature space, which can significantly improve the performance of a learning algorithm in terms of accuracy, learning speed, and the complexity and interpretability of the learnt models. However, feature manipulation is a difficult task, and it faces growing challenges as more and more data is collected in many domains. Evolutionary computation (EC) techniques have recently attracted much attention for dealing with complex feature manipulation problems. Current work has demonstrated some strengths of EC for feature manipulation, but has also revealed limitations and issues that need to be addressed. More importantly, there are several highly interesting research topics in the area of EC for feature manipulation, which could potentially lead to promising approaches to data analysis in a variety of real-world applications. This position paper describes and discusses the main issues and key challenges of feature manipulation, and provides a number of directions for future research.
