Feature selection using rough set-based direct dependency calculation by avoiding the positive region

Abstract: Feature selection is the process of selecting a subset of features from a dataset such that the selected subset can stand in for the full feature set and reduce subsequent processing. Many approaches to feature selection have been proposed, and rough set-based approaches have recently become dominant. The majority of these approaches use attribute dependency as the criterion for determining feature subsets. However, this measure relies on the positive region to calculate dependency, which is computationally expensive and consequently degrades the performance of feature selection algorithms that use it. In this paper, we propose a new heuristic-based dependency calculation method. The proposed method comprises a set of two rules, called Direct Dependency Calculation (DDC), for calculating attribute dependency. Direct dependency counts the number of unique and non-unique classes directly from attribute values. Unique classes are accurate predictors of the class, while non-unique classes are not. Counting unique and non-unique classes in this manner avoids the time-consuming calculation of the positive region, which improves the performance of subsequent algorithms. A two-dimensional grid is used as an intermediate data structure to calculate dependency. To validate the proposed method, we applied it within a number of feature selection algorithms on various publicly available datasets, using a comparison framework for the analysis. Experimental results show the efficiency and effectiveness of the proposed method: execution time was reduced by 63% for dependency calculation using DDCs and by 65% for feature selection algorithms based on DDCs, and the required runtime memory decreased by 95%.
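The contrast between the two ways of computing dependency can be illustrated with a small sketch. The abstract does not state the two DDC rules or the exact layout of the grid, so everything below is an assumption made for illustration: the function names, the row format (tuples whose last column is the decision class), and the dictionary used in place of the two-dimensional grid are all hypothetical. The first function follows the standard rough-set definition, dependency = |POS_C(D)| / |U|. The second counts, in a single pass, the records whose attribute values map to exactly one class.

```python
from collections import defaultdict


def dependency_positive_region(rows, cond_idx, dec_idx):
    """Standard rough-set dependency: |POS_C(D)| / |U|."""
    # Build the equivalence classes induced by the conditional attributes.
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[j] for j in cond_idx)].append(i)
    # A block lies in the positive region if all its records share one decision.
    pos = 0
    for members in blocks.values():
        if len({rows[i][dec_idx] for i in members}) == 1:
            pos += len(members)
    return pos / len(rows)


def dependency_direct(rows, cond_idx, dec_idx):
    """Illustrative direct count (an assumption, not the paper's exact rules)."""
    # Hypothetical stand-in for the grid: one entry per value combination,
    # holding [first decision seen, record count, still maps to one class?].
    grid = {}
    for row in rows:
        key = tuple(row[j] for j in cond_idx)
        entry = grid.get(key)
        if entry is None:
            grid[key] = [row[dec_idx], 1, True]
        else:
            entry[1] += 1
            if entry[0] != row[dec_idx]:
                entry[2] = False  # same values, different class: non-unique
    unique = sum(count for _, count, ok in grid.values() if ok)
    return unique / len(rows)


if __name__ == "__main__":
    # Toy decision table: two conditional attributes, decision in column 2.
    sample = [
        ("a", "x", 0),
        ("a", "x", 0),
        ("a", "y", 1),
        ("b", "y", 1),
        ("b", "y", 0),
        ("b", "x", 1),
    ]
    print(dependency_positive_region(sample, [0, 1], 2))
    print(dependency_direct(sample, [0, 1], 2))
```

Both functions return the same value on the same input; the sketch is meant only to show that the ratio can be obtained without materialising equivalence classes and the positive region, not to reproduce the runtime or memory savings reported above.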
