A set-cover-based approach for the test-cost-sensitive attribute reduction problem

In data mining applications, test-cost-sensitive attribute reduction is an important task that aims to decrease the test cost of data. In operational research, the set cover problem is a classic optimization problem that has been studied for much longer than the attribute reduction problem. In this paper, we apply methods developed for the set cover problem to test-cost-sensitive attribute reduction. First, we transform the test-cost-sensitive reduction problem into an equivalent set cover problem using a constructive approach. We show that computing a reduct of a decision system with minimal test cost is equivalent to computing an optimal solution of the corresponding set cover problem. Then, a set-cover-based heuristic algorithm is introduced to solve the test-cost-sensitive reduction problem. Finally, we conduct several numerical experiments on data sets from the UCI Machine Learning Repository. The experimental results indicate that the set-cover-based algorithm achieves superior performance in most cases and that it is efficient on data sets with many attributes.
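The transformation can be illustrated with a minimal sketch. The abstract does not specify the construction or the heuristic, so everything below is an assumption: the universe is taken to be the set of object pairs with different decisions that at least one attribute discerns, each attribute is the subset of pairs it discerns, and the solver is the classic cost-per-newly-covered-element greedy rule for weighted set cover followed by a simple redundancy-removal pass. The function names (`build_set_cover`, `greedy_weighted_cover`, `prune`) and the toy decision table are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def build_set_cover(rows, decisions):
    """Assumed construction: one universe element per discernible pair of
    objects with different decisions; one subset per attribute."""
    n_attrs = len(rows[0]) if rows else 0
    universe = set()
    covers = {a: set() for a in range(n_attrs)}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            continue  # only pairs with different decisions must be discerned
        for a in range(n_attrs):
            if rows[i][a] != rows[j][a]:
                covers[a].add((i, j))
                universe.add((i, j))
    return universe, covers


def greedy_weighted_cover(universe, covers, costs):
    """Classic greedy rule: repeatedly pick the attribute with the lowest
    test cost per newly covered pair."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = min(
            (a for a in covers if covers[a] & uncovered),
            key=lambda a: costs[a] / len(covers[a] & uncovered),
        )
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


def prune(chosen, universe, covers, costs):
    """Drop attributes (most expensive first) whose removal keeps a cover."""
    kept = list(chosen)
    for a in sorted(chosen, key=lambda a: costs[a], reverse=True):
        rest = [b for b in kept if b != a]
        covered = set().union(*(covers[b] for b in rest))
        if covered >= universe:
            kept = rest
    return kept


# Hypothetical toy decision table: 4 objects, 3 condition attributes.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = [0, 0, 1, 1]
costs = [5, 1, 3]  # assumed test cost of each attribute

universe, covers = build_set_cover(rows, decisions)
reduct = prune(greedy_weighted_cover(universe, covers, costs), universe, covers, costs)
total_cost = sum(costs[a] for a in reduct)
print(reduct, total_cost)  # [2] 3
```

In this toy table the greedy step first selects the cheap attribute 1, which discerns only half of the pairs, and then attribute 2; the pruning pass removes attribute 1 because attribute 2 alone already covers every pair. The greedy rule carries no optimality guarantee in general, which is consistent with the abstract describing the proposed method as a heuristic.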
