Test-cost-sensitive attribute reduction based on neighborhood rough set

Recent research in machine learning and data mining has produced a wide variety of algorithms for cost-sensitive learning. Most existing rough set methods for this problem deal with nominal attributes, because nominal attributes induce equivalence relations and are therefore easy to process. However, real-world datasets often contain numerical attributes. Numerical attributes are more complex than nominal ones and require more computational resources, so the corresponding learning tasks are more challenging. This paper addresses test-cost-sensitive attribute reduction for numerical-valued decision systems. Since the neighborhood rough set model has proven successful in processing numerical data, we adopt it to define the minimal-test-cost reduct problem. Because the new problem is computationally hard, heuristic algorithms are needed to find sub-optimal solutions. We propose a heuristic function that combines the positive region with a weighted test cost term; when test costs are not considered, it degrades to the positive region, the measure most commonly used in classical rough set theory. Three metrics are adopted to evaluate the performance of reduction algorithms from a statistical viewpoint. Experimental results show that the proposed method takes advantage of test cost information and produces satisfactory results.
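To make the heuristic concrete, the following Python sketch shows one possible greedy attribute reduction procedure built on a neighborhood positive region. It is only an illustration: the neighborhood radius delta, the weight lam, the use of a cost penalty subtracted from the positive region, and all function names are assumptions introduced here, and the paper's exact heuristic function and weighting scheme may differ.

import numpy as np

def neighborhood_positive_region(X, y, attrs, delta=0.2):
    # Count objects whose delta-neighborhood, computed in the subspace
    # defined by `attrs`, contains only objects of the same class.
    # X: (n, m) array of normalized numerical attribute values.
    # y: (n,) array of class labels.
    if not attrs:
        return 0
    sub = X[:, attrs]
    count = 0
    for i in range(len(sub)):
        dist = np.linalg.norm(sub - sub[i], axis=1)   # Euclidean distances
        neighbors = dist <= delta                     # delta-neighborhood mask
        if np.all(y[neighbors] == y[i]):              # neighborhood is pure
            count += 1
    return count

def cost_sensitive_reduct(X, y, costs, delta=0.2, lam=0.5):
    # Greedy forward selection: at each step add the attribute that maximizes
    # the neighborhood positive region discounted by its weighted test cost,
    # until the positive region of the full attribute set is reached.
    n_attrs = X.shape[1]
    full_pos = neighborhood_positive_region(X, y, list(range(n_attrs)), delta)
    reduct, remaining = [], set(range(n_attrs))
    current_pos = 0
    while current_pos < full_pos and remaining:
        best_a, best_score = None, -np.inf
        for a in remaining:
            pos = neighborhood_positive_region(X, y, reduct + [a], delta)
            score = pos - lam * costs[a]  # positive region with cost penalty (assumed form)
            if score > best_score:
                best_a, best_score = a, score
        reduct.append(best_a)
        remaining.remove(best_a)
        current_pos = neighborhood_positive_region(X, y, reduct, delta)
    return reduct

Setting lam = 0 in this sketch removes the test cost term, so the selection criterion reduces to the plain positive region, mirroring the degenerate case described in the abstract.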
