Semantic spam filtering from personalized ontologies

One of the biggest problems the Internet faces is the growth of email spam. The main drawback of previous anti-spam filters is that they rely either 1) on the syntactic features of words, with no semantic analysis, or 2) on what the majority of users regard as spam, without considering the preferences of the individual user. In this paper we present a spam email filter that personalizes its filtering process using an email user profile that captures the user's preferences regarding emails. Our email user profile is built not only with common user profiling techniques but also from the knowledge contained in a domain ontology. The profile is used to learn which spam emails, although unsolicited and sent in bulk, are nevertheless interesting to the user. The encouraging experimental results provide empirical evidence of the effectiveness of an ontological approach to user profiling in an email spam filter.
