A DKIM based Architecture for Combating Good Word Attack in Statistical Spam Filters

Kashefa Kowser.K, Saruladha.K, Packiavathy.M

Abstract— Abuse of E-Mail by unwanted senders causes an exponential increase of E-Mails in user mailboxes; such mail is known as Spam. Spam, which is unsolicited commercial E-Mail or unsolicited bulk E-Mail, causes huge economic losses to large-scale organizations through high network bandwidth consumption and heavy processing load on mail servers. Statistical spam filters can be used to categorize incoming E-Mails as legitimate or spam, but they are vulnerable to the Good Word attack, which inserts "good words" into spam messages to make them appear legitimate. This paper proposes a counterattack strategy against the insertion of good words: an architecture based on enhanced DKIM (DomainKeys Identified Mail). Our experimental results indicate that DKIM is the most effective approach, because it incorporates sender evidence with random values in the E-Mail messages, which makes it difficult for spammers to evade the E-Mail filtering process. Misclassifying spam E-Mail as legitimate E-Mail reduces the performance of text classifiers. With DKIM, the misclassification percentage decreases as the number of E-Mails increases.
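The Good Word attack can be illustrated with a minimal sketch. It assumes a toy naive Bayes filter with Laplace smoothing and equal class priors; this is not the classifier or data set used in the paper. The training messages and the names `train` and `spam_log_odds` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Toy training corpora (hypothetical, for illustration only).
SPAM = ["cheap pills buy now", "win money now", "buy cheap watches"]
HAM = ["meeting agenda attached", "project report attached", "lunch meeting tomorrow"]


def train(docs):
    """Return per-word counts and the total token count for one class."""
    counts = Counter(word for doc in docs for word in doc.split())
    return counts, sum(counts.values())


def spam_log_odds(message, spam_model, ham_model, vocab_size):
    """Log-odds of spam versus legitimate mail (positive means spam).

    Assumes equal class priors and add-one (Laplace) smoothing.
    """
    (spam_counts, spam_total), (ham_counts, ham_total) = spam_model, ham_model
    score = 0.0
    for word in message.split():
        p_spam = (spam_counts[word] + 1) / (spam_total + vocab_size)
        p_ham = (ham_counts[word] + 1) / (ham_total + vocab_size)
        score += math.log(p_spam / p_ham)
    return score


spam_model, ham_model = train(SPAM), train(HAM)
vocab_size = len(set(spam_model[0]) | set(ham_model[0]))

original = "buy cheap pills now"
# Good Word attack: append words that are common in legitimate mail.
attacked = original + " meeting agenda project report attached"

score_original = spam_log_odds(original, spam_model, ham_model, vocab_size)
score_attacked = spam_log_odds(attacked, spam_model, ham_model, vocab_size)
print(f"original: {score_original:.2f}, attacked: {score_attacked:.2f}")
```

In this toy setting the appended words pull the score below zero, so the same spam payload would be classified as legitimate. A purely word-based statistical filter has no evidence about the sender to counter this, which is the weakness the proposed DKIM-based architecture targets.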