Anti-spam filtering: a centroid-based classification approach

Electronic mail is now one of the most popular and convenient means of everyday communication. Spam, or junk e-mail, much of it originating from commercial Web sites, appears in users' mailboxes with increasing frequency. We therefore investigated several techniques for filtering junk e-mail, namely naive Bayes, k-nearest neighbor, and a centroid-based approach. We found the centroid-based approach to be the most suitable for mail filtering, with a classification accuracy of 83.00%. The outcome of our research has been implemented as an intelligent Web mail service with an anti-spam filtering plug-in.
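
To make the centroid-based idea concrete, the following is a minimal sketch rather than the implementation evaluated here. The abstract does not specify the feature representation or the similarity measure, so the sketch assumes length-normalized term-frequency vectors and cosine similarity. The function names and the toy messages are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real filter would do more.
    return text.lower().split()


def unit_vector(weights):
    # Scale a sparse term -> weight mapping to unit Euclidean length.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0:
        return {}
    return {term: w / norm for term, w in weights.items()}


def train_centroids(labeled_docs):
    # Average the unit vectors of each class, then renormalize the average.
    sums = defaultdict(lambda: defaultdict(float))
    sizes = Counter()
    for text, label in labeled_docs:
        for term, w in unit_vector(Counter(tokenize(text))).items():
            sums[label][term] += w
        sizes[label] += 1
    return {
        label: unit_vector({t: w / sizes[label] for t, w in terms.items()})
        for label, terms in sums.items()
    }


def cosine(a, b):
    # Both inputs are unit vectors, so the dot product is the cosine similarity.
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def classify(text, centroids):
    # Assign the message to the class whose centroid is most similar.
    vec = unit_vector(Counter(tokenize(text)))
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy training set, not data from the study.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached for review", "ham"),
]

if __name__ == "__main__":
    centroids = train_centroids(TRAIN)
    print(classify("claim your free money", centroids))
    print(classify("monday project meeting", centroids))
```

Under these assumptions, classifying a message costs one similarity computation per class rather than one per training message, as k-nearest neighbor would require.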