Image Categorization Using ESFS: A New Embedded Feature Selection Method Based on SFS

Feature subset selection is an important step when training classifiers in machine learning (ML). Too many input features may lead to the so-called “curse of dimensionality”: the complexity of adjusting the classifier's parameters during training increases exponentially with the number of features. As a result, ML algorithms are known to suffer a significant drop in prediction accuracy when faced with many unnecessary features. In this paper, we introduce a novel embedded feature selection method, called ESFS, which is inspired by the wrapper method SFS in that it relies on the simple principle of incrementally adding the most relevant features. Its originality lies in the use of mass functions from evidence theory, which make it possible to merge the information carried by the features elegantly and in an embedded way, leading to a lower computational cost than the original SFS. This approach has been successfully applied to image categorization, and comparison with other feature selection methods has shown its effectiveness.
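
To illustrate the incremental principle that ESFS borrows from SFS, a minimal sketch of generic sequential forward selection might look like the following. It is not the ESFS algorithm itself and does not use mass functions: the function name, the `score` callback, the feature names, and the toy criterion are all assumptions made for illustration. In a wrapper setting, `score` would typically be something like the cross-validated accuracy of a classifier trained on the candidate subset.

```python
from typing import Callable, FrozenSet, List, Sequence


def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    max_features: int,
) -> List[str]:
    """Greedy SFS: repeatedly add the feature that most improves `score`."""
    selected: List[str] = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining and len(selected) < max_features:
        # Score every candidate when added to the current subset.
        candidates = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no candidate improves the criterion, so stop early
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected


# Hypothetical toy criterion: reward useful features, penalise subset size.
USEFUL = {"color_hist": 0.5, "texture": 0.3, "edges": 0.15}


def toy_score(subset: FrozenSet[str]) -> float:
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.05 * len(subset)


chosen = sequential_forward_selection(
    ["noise_a", "texture", "color_hist", "edges", "noise_b"],
    toy_score,
    max_features=5,
)
print(chosen)
```

Each round calls `score` once per remaining feature, so a wrapper that retrains a classifier inside `score` pays that cost many times over. This is the overhead the abstract says an embedded method aims to reduce.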
