Web-Spam Features Selection Using CFS-PSO

Abstract This paper proposes CFS-PSO, a swarm-based hybrid technique that combines Correlation-Based Feature Selection (CFS) with Particle Swarm Optimization (PSO). PSO is an optimization approach inspired by swarm behavior; it relies on real-valued randomness and global communication among the swarm particles. Feature selection, a pre-processing technique, is a crucial part of data mining and machine learning. Its aims include building simpler and more comprehensible models and improving performance by reducing the time needed to build the learning model and increasing accuracy. We assess the performance of CFS-PSO on the WEBSPAM-UK2006 dataset with five classifiers. Experimental results show a reduction in the number of original features of up to 88% and an increase in F-measure of up to 45.83%.
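
To make the combination described above concrete, the following is a minimal sketch of how a CFS merit score could serve as the fitness function of a binary PSO search. It is an illustration under stated assumptions, not the implementation evaluated in this paper: absolute Pearson correlation stands in for the correlation measure, positions are binarized with a sigmoid rule, and the function names (`pearson`, `cfs_merit`, `binary_pso`) and parameter values (`w`, `c1`, `c2`, swarm size, iteration count) are hypothetical choices.

```python
import math
import random


def pearson(x, y):
    """Absolute Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov / math.sqrt(vx * vy))


def cfs_merit(columns, labels, subset):
    """CFS merit k*r_cf / sqrt(k + k*(k-1)*r_ff) for a list of column indices.

    Assumption: Pearson correlation is used here for simplicity.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean feature-class correlation.
    r_cf = sum(pearson(columns[i], labels) for i in subset) / k
    if k == 1:
        return r_cf
    # Mean feature-feature correlation over all distinct pairs.
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def binary_pso(columns, labels, n_particles=10, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a feature subset maximizing cfs_merit (hypothetical settings)."""
    rng = random.Random(seed)
    d = len(columns)

    def fitness(bits):
        return cfs_merit(columns, labels, [i for i, b in enumerate(bits) if b])

    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for p in range(n_particles):
            for j in range(d):
                # Velocity update: inertia + personal-best pull + global-best pull.
                v = (w * vel[p][j]
                     + c1 * rng.random() * (pbest[p][j] - pos[p][j])
                     + c2 * rng.random() * (gbest[j] - pos[p][j]))
                vel[p][j] = max(-6.0, min(6.0, v))  # clamp to keep exp() well-behaved
                # Sigmoid of the velocity gives the probability of selecting feature j.
                prob = 1.0 / (1.0 + math.exp(-vel[p][j]))
                pos[p][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[p])
            if f > pbest_fit[p]:
                pbest[p], pbest_fit[p] = pos[p][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[p][:], f

    return [i for i, b in enumerate(gbest) if b], gbest_fit
```

Calling `binary_pso(columns, labels)`, where `columns` is a list of per-feature value lists and `labels` is the class vector, returns the selected feature indices together with their merit. The merit rewards subsets whose features correlate with the class but not with each other, which is why the swarm tends to discard redundant features.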