Feature Selection Ensemble for Symbolic Data Classification with AHP

An ensemble of feature selections can improve generalization in learning tasks. However, existing feature selection ensemble methods have two drawbacks. First, they focus on numerical data, and work on feature selection ensembles for symbolic and mixed-type data is very limited. Second, voting-based ensemble strategies tend to select the top-ranked significant features, but they cannot guarantee consistency between the ensemble result and the diverse feature selections it combines. To address these problems, we propose a feature selection ensemble method based on the Analytic Hierarchy Process (AHP). The AHP-based method integrates diverse feature selections into a consistent one under the multiple criteria of feature discernibility and independence. Moreover, the ensemble methodology makes it easier to perform feature selection on distributed data and to incorporate domain knowledge by extending the criteria. Experimental results show that the proposed method is effective for symbolic data classification.
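To make the AHP step concrete, the following minimal sketch shows the generic AHP computation: priority weights from a reciprocal pairwise comparison matrix, Saaty's consistency ratio, and a weighted combination over two criteria. It is not the authors' algorithm. The function names (`ahp_priorities`, `ahp_rank`), the three candidate features, and every pairwise judgment value are hypothetical and chosen only for illustration. Approximating the principal eigenvector by power iteration is also an assumption of this sketch.

```python
# Saaty's random consistency index (RI) for matrix orders 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_priorities(pairwise, iterations=100):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix.

    The weights approximate the normalized principal eigenvector,
    computed here by power iteration (orders 1..9 only).
    """
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, cr


def ahp_rank(criteria_matrix, alternative_matrices):
    """Combine per-criterion priorities into one global score per alternative."""
    criteria_w, _ = ahp_priorities(criteria_matrix)
    local = [ahp_priorities(m)[0] for m in alternative_matrices]
    n_alt = len(local[0])
    return [sum(criteria_w[c] * local[c][a] for c in range(len(local)))
            for a in range(n_alt)]


# Hypothetical judgments: discernibility is rated twice as important
# as independence.
criteria = [[1, 2],
            [1 / 2, 1]]

# Hypothetical pairwise comparisons of three candidate features (f1, f2, f3)
# under each criterion.
discernibility = [[1, 3, 5],
                  [1 / 3, 1, 2],
                  [1 / 5, 1 / 2, 1]]
independence = [[1, 1 / 2, 2],
                [2, 1, 3],
                [1 / 2, 1 / 3, 1]]

global_scores = ahp_rank(criteria, [discernibility, independence])
disc_cr = ahp_priorities(discernibility)[1]
indep_cr = ahp_priorities(independence)[1]

for name, score in zip(["f1", "f2", "f3"], global_scores):
    print(name, round(score, 3))
print("consistency ratios:", round(disc_cr, 4), round(indep_cr, 4))
```

A consistency ratio below 0.1 is the conventional threshold for accepting a pairwise matrix. In this sketch, a further criterion (for example, one encoding domain knowledge) would be added as another row and column of `criteria` together with its own alternative matrix.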
