Multistage Email Spam Filtering Based on Three-Way Decisions

A three-way (ternary) decision approach to email spam filtering divides incoming emails into three folders: a mail folder of emails that we accept as legitimate, a spam folder of emails that we reject as legitimate (that is, treat as spam), and a third folder of emails that we can neither accept nor reject based on the available information. The third folder lets us reduce both acceptance and rejection errors, because uncertain emails are deferred rather than forced into one of the two classes. Many existing ternary approaches are essentially single-stage processes. In this paper, we propose a multistage model of three-way email spam filtering based on the principles of granular computing and rough sets.
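
The three-folder idea can be illustrated with a minimal sketch. It is not the model proposed in the paper: the threshold names `alpha` and `beta`, the folder labels, the function names, and the notion of a stage as an (estimator, alpha, beta) triple are all assumptions made for illustration. The sketch assumes each stage provides some estimate of the probability that an email is legitimate, and that only emails left undecided by one stage are passed on to the next.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical folder labels; the abstract names only a mail folder,
# a spam folder, and a third folder.
ACCEPT, REJECT, DEFER = "mail", "spam", "undecided"


def three_way_decide(p_legit: float, alpha: float, beta: float) -> str:
    """Single-stage three-way decision on an estimated probability of legitimacy.

    alpha and beta are assumed thresholds with 0 <= beta < alpha <= 1.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if p_legit >= alpha:
        return ACCEPT   # accept as legitimate
    if p_legit <= beta:
        return REJECT   # reject as legitimate, i.e. treat as spam
    return DEFER        # not enough information to accept or reject


# A stage is an assumed triple: (probability estimator, alpha, beta).
Stage = Tuple[Callable[[str], float], float, float]


def multistage_filter(email: str, stages: Sequence[Stage]) -> str:
    """Apply stages in order; only undecided emails reach the next stage."""
    for estimate, alpha, beta in stages:
        decision = three_way_decide(estimate(email), alpha, beta)
        if decision != DEFER:
            return decision
    return DEFER  # still undecided after the last stage


if __name__ == "__main__":
    # Toy estimators standing in for real classifiers (purely illustrative).
    coarse = lambda text: 0.5    # first stage cannot decide
    finer = lambda text: 0.85    # second stage has more information
    print(multistage_filter("hello", [(coarse, 0.8, 0.2), (finer, 0.8, 0.2)]))
```

In this sketch, a later stage stands for a decision made with more information; how the stages and thresholds are actually constructed is left open here.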
