Dimensionality reduction in face detection: A genetic programming approach

The large number of features in many machine vision applications has a major impact on the performance of machine learning algorithms. Feature selection (FS) is one route to dimensionality reduction. Evolutionary search techniques have been very promising in finding solutions in the exponentially growing search space of FS problems. This paper proposes a genetic programming (GP) approach to FS in which the building blocks are subsets of features and set operators. We use a bit-mask representation for subsets and a collection of set operators as primitive functions. The GP search then combines these subsets and set operations to find an optimal subset of features. The task we study is a highly imbalanced face detection problem. A modified version of the Naïve Bayes classification model is used as the fitness function. Our results show that the proposed algorithm can achieve a significant reduction in dimensionality and processing time. Using the GP-selected features, the performance of certain classifiers can also be improved.
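As an illustration of the representation described above, the following is a minimal sketch of how a GP tree built from bit-mask terminals and set-operator functions could evaluate to a feature subset. The feature count, the particular operators (union, intersection, difference), and the names `Node`, `evaluate`, and `selected_features` are assumptions made for this example; the abstract does not specify them. Fitness evaluation with the modified Naïve Bayes model is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

N_FEATURES = 8  # hypothetical feature count, chosen only for this example

# Set operators on bit masks: bit i is 1 when feature i is in the subset.
# Which operators the paper actually uses is not stated; these are assumed.
def union(a: int, b: int) -> int:
    return a | b

def intersection(a: int, b: int) -> int:
    return a & b

def difference(a: int, b: int) -> int:
    return a & ~b

@dataclass(frozen=True)
class Node:
    """Internal GP node: a set operator applied to two subtrees."""
    op: Callable[[int, int], int]
    left: "Tree"
    right: "Tree"

# A tree is either a terminal (a bit mask) or an operator node.
Tree = Union[int, Node]

def evaluate(tree: Tree) -> int:
    """Recursively reduce a GP tree to a single bit mask."""
    if isinstance(tree, int):
        return tree
    return tree.op(evaluate(tree.left), evaluate(tree.right))

def selected_features(mask: int) -> List[int]:
    """List the indices of the features whose bit is set in the mask."""
    return [i for i in range(N_FEATURES) if (mask >> i) & 1]

# (A union B) minus C, with three hypothetical terminal subsets.
tree = Node(difference, Node(union, 0b00001111, 0b00110000), 0b00000101)
subset = evaluate(tree)
print(selected_features(subset))  # → [1, 3, 4, 5]
```

In a full GP run, a population of such trees would be varied by crossover and mutation, and each tree's resulting subset would be scored by the fitness function.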
