Neighborhood rough set reduction with fish swarm algorithm

Feature reduction is the problem of removing input features that are less predictive of a given outcome. It arises in many areas, such as pattern recognition, machine learning, and data mining, and it has been applied successfully to tasks involving datasets with very large numbers of features. Rough set theory has been used as a dataset preprocessor with much success, but current methods are inadequate for numerical feature reduction. Because the classical rough set model can only evaluate categorical features, we introduce a neighborhood rough set model that handles numerical datasets by defining a neighborhood relation. However, this method alone does not reliably find optimal subsets. In this paper, we propose a new feature reduction mechanism based on the fish swarm algorithm (FSA) to address this shortcoming. The method is applied to the problem of finding optimal feature subsets in the neighborhood rough set reduction process. We define three foraging behaviors of fish to search for optimal subsets and a fitness function to evaluate candidate solutions. We construct a neighborhood feature reduction algorithm based on FSA and design experiments comparing it with a heuristic neighborhood feature reduction method. Experimental results show that the FSA-based neighborhood reduction method is suitable for numerical data and is more likely to find an optimal reduct.
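To make the neighborhood relation and the fitness function concrete, the following is a minimal sketch, not the paper's implementation. It assumes a Euclidean delta-neighborhood and the common neighborhood dependency degree (the fraction of samples whose neighborhood contains a single decision class). The abstract does not give the fitness function, so the weighted form below, the weight `alpha`, and the names `neighborhood_dependency` and `fitness` are illustrative assumptions.

```python
import numpy as np


def neighborhood_dependency(X, y, features, delta):
    """Fraction of samples whose delta-neighborhood, measured on the
    selected features only, contains a single decision class.
    Assumption: Euclidean distance defines the neighborhood relation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    idx = np.asarray(list(features), dtype=int)
    sub = X[:, idx]
    # Pairwise Euclidean distances restricted to the selected features.
    diff = sub[:, None, :] - sub[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    consistent = 0
    for i in range(len(y)):
        neighbors = dist[i] <= delta  # always includes sample i itself
        if np.all(y[neighbors] == y[i]):
            consistent += 1
    return consistent / len(y)


def fitness(X, y, features, delta, alpha=0.9):
    """Hypothetical fitness: rewards high dependency and small subsets.
    The weighting scheme and alpha are assumptions, not from the paper."""
    n_total = np.asarray(X).shape[1]
    gamma = neighborhood_dependency(X, y, features, delta)
    compactness = (n_total - len(features)) / n_total
    return alpha * gamma + (1 - alpha) * compactness


# Toy data: feature 0 separates the two classes, feature 1 does not.
X_demo = [[0.0, 0.0], [0.1, 1.0], [1.0, 0.05], [1.1, 0.95]]
y_demo = [0, 0, 1, 1]
dep_f0 = neighborhood_dependency(X_demo, y_demo, [0], delta=0.2)
dep_f1 = neighborhood_dependency(X_demo, y_demo, [1], delta=0.2)
fit_f0 = fitness(X_demo, y_demo, [0], delta=0.2)
fit_both = fitness(X_demo, y_demo, [0, 1], delta=0.2)
```

In a swarm-based search, each fish would encode a candidate feature subset and be scored by a function of this kind. Under this assumed weighting, a smaller subset scores higher than a larger one that reaches the same dependency.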
