Combining Behavior Models to Secure Email Systems

We introduce the Email Mining Toolkit (EMT), a system that implements behavior-based methods to improve the security of email systems. Behavior models of email flows and email account usage may be used for a variety of detection tasks. Behavior-based models are quite different from the "content-based" models in common use today, such as virus scanners. We evaluate the soundness of these techniques for detecting the onset of viral propagations. The results suggest that either email delivery should be egress rate limited (messages stored for a while and then forwarded), or a record of recently delivered emails should be kept, so that sufficient statistics can be gathered to verify that a propagation is ongoing. EMT can form part of a larger security platform that deals with email security issues in general. We present the variety of EMT models implemented to date and suggest other security tasks that may benefit from its detection capabilities.
