Fake Account Detection in Twitter Based on Minimum Weighted Feature set

Social networking sites such as Twitter and Facebook attract over 500 million users across the world. For these users, social life and even practical life have become intertwined with social networking, and their interaction with it has changed their lives permanently. Accordingly, social networking sites have become one of the main channels for the rapid dissemination of many kinds of information during real-time events. This popularity has led to several problems, including the exposure of users to incorrect information through fake accounts, which results in the spread of malicious content during such events. This can cause serious real-world damage to society in general, including citizens, business entities, and others. In this paper, we present a classification method for detecting fake accounts on Twitter. The study determines a minimized set of the main factors that influence the detection of fake accounts on Twitter, and then applies these factors using different classification techniques. The results of these techniques are compared, and the algorithm with the highest accuracy is selected. The study has also been compared with recent research in the same area, and this comparison supports the accuracy of the proposed approach. We claim that this study can be applied continuously on Twitter to detect fake accounts automatically; moreover, it can be applied to other social networking sites such as Facebook with minor changes that depend on the nature of the social network, which are discussed in this paper.

Keywords: fake account detection, classification algorithms, Twitter account analysis, feature-based techniques.
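The idea of reducing the attributes to a minimum weighted set can be illustrated with a small sketch. The abstract does not state which weighting measure the study uses, so the example below assumes gain ratio as one common choice. The feature names (`default_profile_image`, `followers_bucket`, `geo_enabled`), the toy accounts, and the `threshold` value are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete feature with respect to the class labels."""
    total = len(labels)
    # Group the labels by feature value to compute the conditional entropy.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(values)  # entropy of the feature's own distribution
    return info_gain / split_info if split_info > 0 else 0.0


def select_features(rows, labels, threshold):
    """Weight every feature by gain ratio and keep those at or above threshold."""
    weights = {name: gain_ratio([r[name] for r in rows], labels) for name in rows[0]}
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, weight in ranked if weight >= threshold], weights


# Toy data invented for illustration only (hypothetical features and labels).
rows = [
    {"default_profile_image": 1, "followers_bucket": "low", "geo_enabled": 0},
    {"default_profile_image": 1, "followers_bucket": "low", "geo_enabled": 1},
    {"default_profile_image": 1, "followers_bucket": "low", "geo_enabled": 0},
    {"default_profile_image": 1, "followers_bucket": "high", "geo_enabled": 1},
    {"default_profile_image": 0, "followers_bucket": "high", "geo_enabled": 0},
    {"default_profile_image": 0, "followers_bucket": "high", "geo_enabled": 1},
    {"default_profile_image": 0, "followers_bucket": "low", "geo_enabled": 0},
    {"default_profile_image": 0, "followers_bucket": "high", "geo_enabled": 1},
]
labels = ["fake", "fake", "fake", "fake", "real", "real", "real", "real"]

# The threshold is an arbitrary assumption chosen for this toy example.
selected, weights = select_features(rows, labels, threshold=0.1)
print(selected)
```

In a pipeline of the kind the abstract describes, the retained features would then be passed to each candidate classifier and the resulting accuracies compared; that step is omitted here.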
