Bot prediction on social networks of Twitter in altmetrics using deep graph convolutional networks

In the context of smart cities, it is crucial to filter out falsified information spread on social media through paid campaigns or bot accounts, which significantly influence communication networks across social communities and may affect citizens' decision-making. In this paper, we focus on two major aspects of the Twitter social network associated with altmetrics: (a) analyzing the properties of bots on Twitter networks and (b) distinguishing between bot and human accounts. First, we employed state-of-the-art social network analysis techniques that exploit Twitter’s social network properties in novel altmetrics data. We found that 87% of tweets are affected by bots that are involved in the network’s dominant communities. We also found that, to some extent, community size and the degree distribution in Twitter’s altmetrics network follow a power-law distribution. Furthermore, we applied a deep learning model, graph convolutional networks, to distinguish between organic (human) and bot Twitter accounts. The deployed model achieved promising results, reaching up to 71% classification accuracy over 200 epochs. Overall, the study concludes that bot presence on altmetrics-associated social media platforms can artificially inflate social usage counts. As a result, special attention is required to eliminate such discrepancies when using altmetrics data for smart decision-making, such as research assessment, whether independently or as a complement to traditional bibliometric indices.
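To make the graph convolutional step concrete, the following is a minimal sketch of a single propagation layer in its commonly used normalized form, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W). It is an illustration only, not the model from this study: the abstract does not specify the architecture, input features, or hyperparameters, so the toy graph, the feature matrix, the weights, and the function names (`normalize_adjacency`, `matmul`, `gcn_layer`) are all assumptions.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix (list of lists)."""
    n = len(adj)
    # Add self-loops so each account keeps its own features during aggregation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj_norm, features, weights):
    """One propagation step: ReLU(adj_norm @ features @ weights)."""
    z = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical toy graph: 3 accounts with edges 0-1 and 1-2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
# Hypothetical 2-dimensional account features and layer weights.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[0.5, -0.5], [0.25, 0.75]]

hidden = gcn_layer(normalize_adjacency(adj), features, weights)
```

In a bot classifier of this kind, one or more such layers would typically be followed by a softmax over the two classes (bot, human) and trained on labeled accounts; those details are likewise assumptions here.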
