An Efficient Feature Selection Algorithm for Spam Email Classification

Existing spam email classification systems suffer from low accuracy because of the high dimensionality of the feature space that feature selection (FS) must search. As a global optimization problem in machine learning, FS aims to reduce redundancy in the dataset so that the classifier yields acceptable and accurate results. This study combines the Chaotic Particle Swarm Optimization (PSO) algorithm with the Artificial Bee Colony (ABC) algorithm to reduce feature dimensionality and thereby improve spam email classification accuracy. Each particle's features were represented in binary form, obtained by transforming them with a sigmoid function. Features were selected using a fitness function based on the classification accuracy obtained with a support vector machine (SVM). The proposed system was evaluated on the Spambase dataset in terms of classifier performance and the dimension of the selected feature vector that serves as the classifier's input. The results show that the PSO-ABC approach performed well in terms of FS, even with a small set of selected features.
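The sigmoid-based binary encoding and the accuracy-driven fitness described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the sigmoid output is thresholded, so the stochastic rule used here (a feature is kept when a uniform random draw falls below the sigmoid value, as in standard binary PSO) is an assumption. The names `binarize`, `fitness`, and `accuracy_fn` are hypothetical, and `accuracy_fn` stands in for whatever routine trains the SVM on the selected columns and returns its accuracy.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function, returns a value in [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position, rng):
    """Map a real-valued particle position to a 0/1 feature mask.

    Assumption: feature i is kept when a uniform draw is below
    sigmoid(position[i]); the abstract does not specify the rule.
    """
    return [1 if rng.random() < sigmoid(v) else 0 for v in position]


def fitness(mask, accuracy_fn):
    """Score a feature mask by the accuracy of a classifier trained on it.

    accuracy_fn is a hypothetical callable that receives the list of
    selected feature indices and returns an accuracy in [0, 1]
    (for example, the cross-validated accuracy of an SVM).
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        # An empty subset cannot be classified, so give it the worst score.
        return 0.0
    return accuracy_fn(selected)


if __name__ == "__main__":
    rng = random.Random(0)
    position = [rng.uniform(-4.0, 4.0) for _ in range(10)]
    mask = binarize(position, rng)
    # Dummy accuracy function used only to make the sketch runnable.
    print(mask, fitness(mask, lambda idx: len(idx) / len(mask)))
```

In a full optimizer, each particle's position would be updated by the PSO and ABC steps, re-binarized, and re-scored with `fitness` at every iteration, with the best-scoring mask defining the selected feature subset.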