Identifying buzz in social media: a hybrid approach using artificial bee colony and k-nearest neighbors for outlier detection

The exponential growth of social media has changed how individuals communicate and interact, and has opened new avenues for domains such as health care, marketing, e-commerce, e-governance and politics. These engagements generate a huge amount of user-generated content (UGC) from both individuals and organizations. This UGC can be analyzed in multiple ways to mine useful information, and one popular application is measuring content buzz, or popularity. Content shared on social media platforms becomes popular, and subsequently viral, when a large audience shares and propagates it at a fast pace. Organizations leverage this effect by employing buzz monitoring techniques to boost the reach of their content. This study proposes a hybrid artificial bee colony approach integrated with k-nearest neighbors to identify and segregate buzz on Twitter. A set of metrics comprising created discussions, increase in authors, attention level, burstiness level, contribution sparseness, author interaction, author count and average length of discussions is used to model the buzz. The proposed approach treats buzz discussions as outliers that deviate from normal discussions and identifies them using the hybrid bio-inspired approach. The findings may be useful in domains such as e-commerce and digital and influencer marketing for exploring the factors that might create buzz, as well as the difference between the impact of buzz and of normal discussions on consumers.
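To make the "buzz as outlier" idea concrete, the following is a minimal sketch of one common k-nearest-neighbor outlier score: each discussion is scored by the distance to its k-th nearest neighbor, and the highest-scoring discussions are flagged as candidate buzz. This is an illustrative assumption, not the paper's method: the abstract does not say how the artificial bee colony search and k-nearest neighbors are combined, so the bee colony component is omitted here. The function names and the two-feature sample data (standing in for metrics such as attention level and burstiness level) are hypothetical.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    A larger score means the point sits farther from its neighbors,
    i.e. it is more outlier-like.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest k-NN scores."""
    scores = knn_outlier_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical, already-normalized feature vectors for six discussions.
# The last one is deliberately placed far from the rest.
discussions = [
    (0.10, 0.20), (0.12, 0.18), (0.11, 0.22), (0.09, 0.19),
    (0.13, 0.21), (0.95, 0.90),
]

print(top_outliers(discussions, k=2, n=1))  # index of the most outlying discussion
```

In a hybrid setup, a metaheuristic such as artificial bee colony could plausibly be used to tune choices like `k` or feature weights, but that pairing is a guess rather than something the abstract states.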
