Feature selection with test cost constraint

Feature selection is an important preprocessing step in machine learning and data mining. In real-world applications, acquiring feature values incurs costs in money, time, and other resources. When resources are limited, the total test cost is subject to a constraint, so an informative yet cheap feature subset must be selected deliberately for classification. This paper proposes the feature selection with test cost constraint problem to address this issue. The new problem has a simple form when described as a constraint satisfaction problem (CSP). Backtracking is a general algorithm for CSPs, and it solves the new problem efficiently on medium-sized data. Because the backtracking algorithm does not scale to large datasets, a heuristic algorithm is also developed. Experimental results show that the heuristic algorithm finds the optimal solution in most cases. We also redefine some existing feature selection problems in rough sets, especially in decision-theoretic rough sets, from the viewpoint of CSP. These new definitions provide insight into some new research directions.
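
To make the problem statement concrete, the following is a minimal sketch of a backtracking search over feature subsets whose total test cost stays within a budget. It is not the algorithm from the paper, whose details the abstract does not give. The quality measure (size of the rough-set positive region), the tie-breaking rule (prefer the cheaper subset), the function names, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def positive_region_size(rows: Sequence[Sequence[int]],
                         labels: Sequence[int],
                         features: Sequence[int]) -> int:
    """Count objects whose block (objects equal on `features`) has one label.

    Assumed quality measure: the size of the rough-set positive region.
    """
    blocks = {}
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        blocks.setdefault(key, []).append(label)
    return sum(len(ls) for ls in blocks.values() if len(set(ls)) == 1)


def select_features(rows: Sequence[Sequence[int]],
                    labels: Sequence[int],
                    costs: Sequence[int],
                    budget: int) -> Tuple[Tuple[int, ...], int, int]:
    """Exhaustive backtracking over subsets with total cost <= budget.

    Returns (subset, quality, cost). Ties in quality go to the cheaper
    subset; this rule is an assumption of the sketch.
    """
    n = len(costs)
    best = {"quality": -1, "cost": 0, "subset": ()}

    def recurse(start: int, chosen: List[int], spent: int) -> None:
        quality = positive_region_size(rows, labels, chosen)
        if quality > best["quality"] or (
            quality == best["quality"] and spent < best["cost"]
        ):
            best.update(quality=quality, cost=spent, subset=tuple(chosen))
        for f in range(start, n):
            # The cost constraint prunes every branch that would exceed it.
            if spent + costs[f] <= budget:
                chosen.append(f)
                recurse(f + 1, chosen, spent + costs[f])
                chosen.pop()

    recurse(0, [], 0)
    return best["subset"], best["quality"], best["cost"]


# Hypothetical toy decision table: the label is the XOR of features 0 and 1.
ROWS = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
LABELS = [0, 1, 1, 0]
COSTS = [2, 3, 10]  # hypothetical test cost of each feature

if __name__ == "__main__":
    print(select_features(ROWS, LABELS, COSTS, budget=5))
```

The search visits every subset that fits the budget, so its running time grows exponentially with the number of affordable features. This is consistent with the abstract's remark that backtracking suits medium-sized data and that a heuristic is needed for larger datasets.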
