Compact Rule Learner on Weighted Fuzzy Approximation Spaces for Class Imbalanced and Hybrid Data

Rough set theory is an efficient tool for machine learning and knowledge acquisition. By introducing weightiness into a fuzzy approximation space, we propose a new rule induction algorithm that combines three types of uncertainty: weightiness, fuzziness and roughness. We first define the key concepts of block, minimal complex and local covering in a weighted fuzzy approximation space, then present a rule learner based on this space, and finally introduce a weighted certainty factor for evaluating fuzzy classification rules. The time complexity of the proposed rule learner is analyzed theoretically. To estimate the performance of the proposed method on class imbalanced and hybrid datasets, we compare it with classical methods in experiments on fifteen datasets. The comparative studies indicate that rule sets extracted by this method perform better on the minority class than those of the other approaches. We therefore conclude that the proposed rule learner is an effective method for learning from class imbalanced and hybrid data.
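
The abstract does not give the formula for the weighted certainty factor, so the following is only an illustrative sketch of what such a measure could look like, not the definition used in the paper. It assumes that each example carries a non-negative weight (for instance, larger for minority-class examples), that a rule's antecedent and consequent each assign a membership degree in [0, 1] to every example, and that the fuzzy intersection is taken as the minimum. The function name and parameters are hypothetical.

```python
from typing import Sequence


def weighted_certainty_factor(
    antecedent: Sequence[float],
    consequent: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Hypothetical weighted certainty factor of a fuzzy rule.

    antecedent[i]: membership of example i in the rule's condition part.
    consequent[i]: membership of example i in the rule's decision class.
    weights[i]:    non-negative weight of example i (assumed, e.g. higher
                   for minority-class examples).

    Returns the weighted share of the antecedent's fuzzy coverage that
    also belongs to the decision class, using min as the intersection.
    This is an assumed form, not the paper's definition.
    """
    if not (len(antecedent) == len(consequent) == len(weights)):
        raise ValueError("all inputs must have the same length")

    # Weighted fuzzy cardinality of the antecedent.
    coverage = sum(w * a for w, a in zip(weights, antecedent))
    if coverage == 0:
        return 0.0  # the rule covers nothing, so it carries no certainty

    # Weighted fuzzy cardinality of (antecedent AND consequent).
    joint = sum(
        w * min(a, c) for w, a, c in zip(weights, antecedent, consequent)
    )
    return joint / coverage


# Three examples; the first belongs to the minority class and is
# given twice the weight of the others.
cf_weighted = weighted_certainty_factor(
    antecedent=[1.0, 0.5, 0.0],
    consequent=[1.0, 0.0, 1.0],
    weights=[2.0, 1.0, 1.0],
)
cf_uniform = weighted_certainty_factor(
    antecedent=[1.0, 0.5, 0.0],
    consequent=[1.0, 0.0, 1.0],
    weights=[1.0, 1.0, 1.0],
)
```

Under these assumptions, raising the weight of a correctly covered minority example raises the rule's score relative to the uniform-weight case, which is the kind of effect a weighted measure is meant to capture.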
