Detection of Spammers in Twitter Marketing: A Hybrid Approach Using Social Media Analytics and Bio Inspired Computing

Web 2.0 technologies, especially social media platforms like Twitter, have drastically improved customer engagement. Organizations often use these platforms for marketing, where the creation of numerous spam profiles for content promotion is common. This paper proposes a hybrid approach for identifying spam profiles by combining social media analytics and bio inspired computing. It adopts a modified K-Means integrated Lévy flight Firefly Algorithm (LFA) with chaotic maps, an extension of the Firefly Algorithm (FA), for spam detection in Twitter marketing. A total of 1,844,701 tweets from 14,235 Twitter profiles have been analyzed on 13 statistically significant factors derived from social media analytics. A Fuzzy C-Means clustering approach is further used to identify users who overlap between the two clusters of spammers and non-spammers. Six variants of K-Means integrated FA, including chaotic maps and Lévy flights, are tested. The findings indicate that FA with chaos for tuning the attractiveness coefficient using the Gauss map converges to a working solution the fastest. Further, LFA with chaos for tuning the absorption coefficient using the sinusoidal map outperforms the other approaches in terms of accuracy.
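To make the combination of techniques named in the abstract more concrete, the following is a minimal sketch of a K-Means integrated firefly search with Lévy flights and a chaotic absorption coefficient. It is not the paper's implementation. Every function name, the swarm size, the step scale alpha, the base attractiveness beta0, the Lévy exponent 1.5, the sinusoidal map constant 2.3 and the starting chaotic value 0.7 are assumptions chosen for illustration. The Lévy step uses Mantegna's algorithm, and the Gauss map is included only as the alternative chaotic sequence mentioned in the abstract.

```python
import math
import random


def gauss_map(x):
    """Gauss map on [0, 1): x -> fractional part of 1/x (0 stays 0)."""
    return 0.0 if x == 0 else (1.0 / x) % 1.0


def sinusoidal_map(x, a=2.3):
    """Sinusoidal map x -> a * x^2 * sin(pi * x); a = 2.3 is an assumed constant."""
    return a * x * x * math.sin(math.pi * x)


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step using Mantegna's algorithm (assumed exponent beta)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def sse(centroids, points):
    """K-Means objective: sum of squared distances to the nearest centroid."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def firefly_kmeans(points, k=2, n_fireflies=8, n_iter=30,
                   alpha=0.05, beta0=1.0, seed=0):
    """Search for k centroids; each firefly is one candidate set of centroids."""
    rng = random.Random(seed)
    swarm = [[list(rng.choice(points)) for _ in range(k)]
             for _ in range(n_fireflies)]
    cost = [sse(f, points) for f in swarm]
    start = min(range(n_fireflies), key=cost.__getitem__)
    best, best_cost = [c[:] for c in swarm[start]], cost[start]
    gamma = 0.7  # assumed initial chaotic value for the absorption coefficient
    for _ in range(n_iter):
        gamma = sinusoidal_map(gamma)  # chaotic tuning of absorption
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] >= cost[i]:
                    continue  # only move towards brighter (lower-cost) fireflies
                r2 = sum((p - q) ** 2
                         for ci, cj in zip(swarm[i], swarm[j])
                         for p, q in zip(ci, cj))
                pull = beta0 * math.exp(-gamma * r2)
                for ci, cj in zip(swarm[i], swarm[j]):
                    for d in range(len(ci)):
                        # attraction towards j plus a Levy-flight perturbation
                        ci[d] += pull * (cj[d] - ci[d]) + alpha * levy_step(rng)
                cost[i] = sse(swarm[i], points)
                if cost[i] < best_cost:
                    best, best_cost = [c[:] for c in swarm[i]], cost[i]
    return best, best_cost


if __name__ == "__main__":
    # Toy two-feature data standing in for per-profile factors (made up).
    data = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.95)]
    centroids, cost = firefly_kmeans(data, k=2, seed=1)
    print(centroids, cost)
```

Swapping `sinusoidal_map` for `gauss_map`, and applying the chaotic value to `beta0` instead of `gamma`, would give a rough analogue of the attractiveness-tuning variant. Which quantity is tuned, and by which map, is exactly what distinguishes the variants compared in the abstract.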
