Non-persistent stratified sampling based IQRA_IG for scalable reduct generation

Feature selection in large datasets using reducts based on rough set principles is computationally expensive. Existing scalable, sampling-based reduct computation algorithms suffer from limitations such as redundancy and inadequacy. This paper develops two algorithms for finding a reduct, one based on stratified sampling and one on non-persistent stratified sampling, which address adequacy and, to a certain extent, redundancy. The paper compares the performance of these algorithms against the discernibility matrix-based sampling approximate reduct algorithm (SARA) and the sample guided improved quick reduct algorithm with information gain heuristic (SGIQRA_IG). The performance of these algorithms is demonstrated on the benchmark large dataset repository of Arizona State University.
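To make the general idea concrete, the following is a minimal sketch of computing a reduct on a stratified sample of a decision table. It is not the paper's algorithm: it uses the plain rough set dependency degree as the selection criterion rather than the information gain heuristic, it does not model the non-persistent variant, and the function names (`stratified_sample`, `dependency`, `quick_reduct`) and the toy data are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_sample(rows, decision_index, fraction, seed=0):
    """Draw roughly the same fraction of objects from every decision class."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[row[decision_index]].append(row)
    sample = []
    for members in strata.values():
        # Keep at least one object per class so no class disappears.
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample


def dependency(rows, attrs, decision_index):
    """Dependency degree: share of objects lying in the positive region."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[decision_index])
    # A block is consistent when all its objects share one decision value.
    consistent = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return consistent / len(rows)


def quick_reduct(rows, cond_attrs, decision_index):
    """Greedy forward selection until the full-set dependency is reached."""
    target = dependency(rows, cond_attrs, decision_index)
    reduct = []
    current = dependency(rows, reduct, decision_index)
    while current < target:
        best = max(
            (a for a in cond_attrs if a not in reduct),
            key=lambda a: dependency(rows, reduct + [a], decision_index),
        )
        reduct.append(best)
        current = dependency(rows, reduct, decision_index)
    return reduct


# Hypothetical toy decision table: columns 0-2 are conditions, column 3 is the decision.
table = [
    (0, 0, 0, "a"),
    (0, 1, 1, "b"),
    (1, 0, 0, "a"),
    (1, 1, 1, "b"),
]
sample = stratified_sample(table, decision_index=3, fraction=0.5)
reduct_on_sample = quick_reduct(sample, [0, 1, 2], decision_index=3)
reduct_on_full = quick_reduct(table, [0, 1, 2], decision_index=3)
```

A reduct found on a sample may fail to preserve the dependency on the full table, which is one way to read the adequacy concern named above; checking `dependency(table, reduct_on_sample, 3)` against the full-table value is a simple way to detect that.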
