A decision-theoretic rough set approach to spam filtering

Spam filtering is an active research topic in information security. To address the weak fault tolerance of traditional filtering methods, an approach to spam filtering based on the α-positive region of the decision-theoretic rough set (DTRS) model is developed. First, the α-positive-region attribute reduction theorem is applied to reduce the email attributes. Then, following minimum-risk Bayesian decision theory, a three-way decision (spam, doubt, or non-spam) is realized by using the boundary region of DTRS to represent the undecided emails. The simulation results indicate that this approach is effective and helps improve spam-filtering performance.
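The three-way decision step can be illustrated with a minimal Python sketch. It uses the standard DTRS threshold formulas derived from minimum-risk Bayesian decision theory, not necessarily the exact procedure of this paper. The `Losses` class, the function names, and the numeric loss values are hypothetical choices for illustration, and the conditional probability that an email is spam is assumed to be estimated elsewhere (for example, from the equivalence classes left after attribute reduction).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Losses:
    """Hypothetical loss values for the three actions under each true state."""
    pp: float   # classify as spam, email is spam
    bp: float   # defer (doubt), email is spam
    np_: float  # classify as non-spam, email is spam
    pn: float   # classify as spam, email is legitimate
    bn: float   # defer (doubt), email is legitimate
    nn: float   # classify as non-spam, email is legitimate


def thresholds(l: Losses) -> tuple[float, float]:
    """Return (alpha, beta) using the standard DTRS minimum-risk formulas."""
    alpha = (l.pn - l.bn) / ((l.pn - l.bn) + (l.bp - l.pp))
    beta = (l.bn - l.nn) / ((l.bn - l.nn) + (l.np_ - l.bp))
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("losses do not yield 0 <= beta < alpha <= 1")
    return alpha, beta


def classify(p_spam: float, alpha: float, beta: float) -> str:
    """Three-way decision on P(spam | email description)."""
    if p_spam >= alpha:
        return "spam"        # positive region
    if p_spam <= beta:
        return "non-spam"    # negative region
    return "doubt"           # boundary region: decision deferred


# Illustrative losses only: blocking a legitimate email (pn) and
# passing a spam email (np_) are costly, deferring is cheap.
example_losses = Losses(pp=0.0, bp=2.0, np_=10.0, pn=8.0, bn=2.0, nn=0.0)
alpha, beta = thresholds(example_losses)
labels = [classify(p, alpha, beta) for p in (0.9, 0.5, 0.1)]
```

With these illustrative losses the thresholds are α = 0.75 and β = 0.2, so an email with an estimated spam probability between the two is placed in the boundary region and labeled doubt rather than forced into either class.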
