Spam detection using KNN and decision tree mechanism in social network

A social network (SN) is an online platform that millions of people use as a communication tool to build social relationships for career, knowledge-sharing, political, and many other purposes. Because so many people are now online, social networks have become one of the largest sources of information. SN applications such as Twitter, Facebook, and MySpace let people communicate with one another by sending text, audio, and video messages. During such communication, a user may engage in unwanted activity and send spam messages that disrupt the exchange, and these messages are difficult to detect. This paper proposes a spam detection mechanism based on the decision tree and k-nearest neighbors (KNN) algorithms. We apply both algorithms to real Twitter datasets to detect spam messages, and we use the Weka tool to analyse the proposed mechanism. Its performance is measured with the TP Rate, FP Rate, Precision, Recall, and F-Measure metrics, reported per class.
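As a rough illustration of the evaluation metrics named in the abstract, the following Python sketch computes TP Rate, FP Rate, Precision, Recall, and F-Measure for one class from a list of true and predicted labels. It is not the paper's Weka workflow: the function name `per_class_metrics`, the `"spam"`/`"ham"` labels, and the toy label lists are assumptions made purely for illustration, and the predictions could come from any classifier, such as KNN or a decision tree.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Compute per-class metrics, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # TP Rate is the same quantity as Recall: TP / (TP + FN).
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # FP Rate: share of non-positive instances wrongly labelled positive.
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp_rate
    # F-Measure: harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical labels for eight messages (illustrative only, not real data).
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "spam", "ham", "ham", "ham", "ham"]

spam_metrics = per_class_metrics(y_true, y_pred, positive="spam")
ham_metrics = per_class_metrics(y_true, y_pred, positive="ham")
print(spam_metrics)
print(ham_metrics)
```

Calling the function once per class label gives one row of metrics per class, which is the sense in which the measures above are reported per class.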
