Quantitative Analysis of Efficient Antispam Techniques

While dynamic content-based filtering mechanisms for identifying unsolicited commercial email (UCE, more commonly "spam") have proven effective, they require considerable computational resources. It is therefore highly desirable to reduce the number of emails that must undergo content-based analysis. In this paper, a number of efficient techniques based on lower-protocol-level properties are analyzed using a large real-world data set. We show that combinations of several network-based filters can provide a computationally efficient pre-filtering mechanism at acceptable false-positive rates.
