Detecting and filtering instant messaging spam - a global and personalized approach

As instant messaging (IM) gains popularity, it is exposed to increasingly severe security threats. One serious problem is IM spam (spim): unsolicited commercial messages sent via IM clients. E-mail spam (unsolicited bulk e-mail) has been a serious security issue for a long time, and a number of techniques have been proposed to cope with it. Spim, by contrast, has not yet received adequate attention from the research community, and traditional spam filtering techniques are not directly applicable to it because of the presence information and real-time nature of IM. In this paper, we present a new architecture for detecting and filtering spim. Owing to the unique infrastructure of IM systems, spim detection and filtering can be performed not only at the client (receiver) side, for personalized filtering, but also at the server side and at various IM gateways, for global filtering. Our technique integrates a number of mature spam defense techniques, modified for IM applications, such as black/white lists, collaborative feedback-based filtering, content-based filtering, and challenge-response-based filtering. We also design and implement new techniques for efficient spim detection and filtering, including filtering based on IM sending rate, content-based spim defense techniques, fingerprint vector-based filtering, text comparison filtering, and Bayesian filtering. We provide an analysis of their performance based on experimental results.
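As a rough illustration of how a black/white list might be combined with filtering based on IM sending rate, consider the minimal Python sketch below. It is not the implementation evaluated in the paper: the class name `SpimFilter`, the threshold values, the sliding-window scheme, and the policy of blacklisting a sender who exceeds the rate are all assumptions made for this example.

```python
from collections import defaultdict, deque


class SpimFilter:
    """Toy filter: white list, then black list, then a sliding-window rate check."""

    def __init__(self, max_messages=5, window_seconds=10.0):
        self.max_messages = max_messages      # hypothetical threshold
        self.window_seconds = window_seconds  # hypothetical window length
        self.whitelist = set()
        self.blacklist = set()
        self._recent = defaultdict(deque)     # sender -> timestamps of recent messages

    def allow(self, sender, timestamp):
        """Return True if a message from `sender` at `timestamp` (seconds) passes."""
        if sender in self.whitelist:
            return True
        if sender in self.blacklist:
            return False
        recent = self._recent[sender]
        # Discard timestamps that have fallen out of the sliding window.
        while recent and timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(timestamp)
        if len(recent) > self.max_messages:
            # Assumption: a sender exceeding the rate is blacklisted from then on.
            self.blacklist.add(sender)
            return False
        return True


if __name__ == "__main__":
    spim_filter = SpimFilter(max_messages=3, window_seconds=10.0)
    spim_filter.whitelist.add("friend")
    for t in range(5):
        print(t, spim_filter.allow("bulk-sender", float(t)), spim_filter.allow("friend", float(t)))
```

Because the check needs only sender identifiers and timestamps, a filter of this kind could in principle run at the server or gateway for global filtering, or at the client with per-user lists for personalized filtering.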
