A reduct derived from feature selection

In this paper, the relationship between an attribute subset of a decision system selected via feature selection by an optimal algorithm and a reduct of the attribute set in the sense of Pawlak's rough sets is discussed. The selected subset is regarded as a solution of this optimal algorithm. It is verified that a locally optimal solution is not necessarily a reduct, whereas a reduct must be a globally optimal solution. Based on these assertions, a new algorithm, called the blindly deleting algorithm with an inverse ordering (BDAIO), is proposed to find a true reduct of a decision information system by repairing the selected attribute subset. Experiments on several standard data sets from the UCI repository demonstrate the validity of the proposal.
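
The abstract describes BDAIO only at a high level, so the following is a minimal illustrative sketch, not the paper's exact procedure: it assumes the standard rough-set positive-region (dependency) criterion as the consistency test and deletes attributes from the selected subset in reverse order of selection, keeping an attribute only if its removal shrinks the positive region. All names (positive_region_size, backward_reduce, the toy decision table) are hypothetical.

from collections import defaultdict

def positive_region_size(rows, attrs):
    """Number of objects whose equivalence class w.r.t. attrs has a unique decision."""
    decisions = defaultdict(set)
    counts = defaultdict(int)
    for row in rows:
        key = tuple(row["conditions"][a] for a in attrs)
        decisions[key].add(row["decision"])
        counts[key] += 1
    return sum(n for key, n in counts.items() if len(decisions[key]) == 1)

def backward_reduce(rows, selected, all_attrs):
    """Delete attributes from `selected` in inverse (reverse-selection) order,
    retaining only those whose removal would shrink the positive region."""
    target = positive_region_size(rows, all_attrs)   # quality of the full attribute set
    kept = list(selected)
    for a in reversed(selected):                     # inverse ordering over the selection
        trial = [b for b in kept if b != a]
        if positive_region_size(rows, trial) == target:
            kept = trial                             # attribute a is superfluous: delete it
    return kept

# Toy usage with a hypothetical three-attribute decision table
rows = [
    {"conditions": {"a1": 0, "a2": 1, "a3": 0}, "decision": "yes"},
    {"conditions": {"a1": 0, "a2": 0, "a3": 1}, "decision": "no"},
    {"conditions": {"a1": 1, "a2": 1, "a3": 1}, "decision": "yes"},
]
print(backward_reduce(rows, selected=["a1", "a2", "a3"], all_attrs=["a1", "a2", "a3"]))
# -> ['a2'], a subset that preserves the positive region of the full attribute set

The deletion order and the positive-region test are assumptions made for illustration; the paper's BDAIO may use a different quality measure or ordering of the selected attributes.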
