SOAP: A Social network Aided Personalized and effective spam filter to clean your e-mail box

The explosive growth of unsolicited email has prompted the development of numerous spam filtering techniques. A Bayesian spam filter is superior to a static keyword-based spam filter because it can continuously evolve to tackle new spam by learning keywords from new spam emails. However, Bayesian spam filters can be easily poisoned when a spammer avoids spam keywords and adds many innocuous keywords to the emails. In addition, they need a significant amount of time to adapt to new spam based on user feedback. Moreover, few current spam filters exploit social networks to assist spam detection. To develop an accurate and user-friendly spam filter, in this paper we propose a SOcial network Aided Personalized and effective spam filter (SOAP). Unlike previous filters that focus on parsing keywords (e.g., Bayesian filters) or building blacklists, SOAP exploits the social relationships among email correspondents to detect spam adaptively and automatically. SOAP integrates three components into the basic Bayesian filter: social closeness-based spam filtering, social interest-based spam filtering, and adaptive trust management. We evaluate the performance of SOAP using trace data from Facebook. Experimental results show that SOAP greatly improves the performance of Bayesian spam filters in terms of the accuracy, attack resilience, and efficiency of spam detection. We also find that the performance of Bayesian spam filters is a lower bound on the performance of SOAP.
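The poisoning weakness described above can be illustrated with a minimal sketch of a generic naive Bayes keyword filter. This is not SOAP's implementation, and it does not model the three social components; the class name `NaiveBayesFilter`, the Laplace smoothing, and the toy training messages are all assumptions made for illustration.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Hypothetical word-level naive Bayes filter with Laplace smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Learn keywords from a labeled message (label is "spam" or "ham").
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def spam_log_odds(self, text):
        # Positive result: the message looks like spam; negative: like ham.
        vocab_size = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        n_spam = sum(self.word_counts["spam"].values())
        n_ham = sum(self.word_counts["ham"].values())
        score = math.log(self.doc_counts["spam"] / self.doc_counts["ham"])
        for word in text.lower().split():
            p_spam = (self.word_counts["spam"][word] + 1) / (n_spam + vocab_size)
            p_ham = (self.word_counts["ham"][word] + 1) / (n_ham + vocab_size)
            score += math.log(p_spam / p_ham)
        return score


# Toy training data (made up for this example).
bayes = NaiveBayesFilter()
bayes.train("cheap pills buy now", "spam")
bayes.train("win money now", "spam")
bayes.train("meeting agenda for monday", "ham")
bayes.train("lunch with the project team", "ham")

plain = "buy cheap pills now"
# The attacker pads the same message with innocuous words.
poisoned = plain + " meeting agenda for monday lunch with the project team"

plain_score = bayes.spam_log_odds(plain)
poisoned_score = bayes.spam_log_odds(poisoned)
print("plain:", plain_score)
print("poisoned:", poisoned_score)
```

In this toy setting the padded message scores lower than the plain one because each innocuous word contributes a negative term to the log-odds, which is the keyword-level weakness that motivates adding social signals to the filter.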
