Query or spam: Detecting fraudulent web requests using stream clustering

Today we are surrounded by a huge amount of data on the Internet, especially on the World Wide Web. Search engines have been developed as an important tool to help users reach the information they are looking for. However, spammers try to exploit these engines and manipulate their behavior by sending spam queries. Detecting these spam queries, which are usually sent by botnets, is therefore of great importance. In this paper, we propose a method based on a semi-supervised stream clustering algorithm that analyzes users' activity logs on a per-session basis and identifies such spammers. To evaluate the method, we used k-fold cross-validation, which yielded satisfactory accuracy.