Critical reflections on three popular computational linguistic approaches to examine Twitter discourses

Although computational linguistic methods (such as topic modelling, sentiment analysis and emotion detection) can provide social media researchers with insights into online public discourses, it is not self-evident how these methods should be used, and transparent instructions on how to apply them critically are lacking. There is a growing body of work focusing on the strengths and shortcomings of these methods. Applying best practices for these methods drawn from the literature, we focus on setting expectations, presenting trajectories, examining results in context and critically reflecting on the diachronic Twitter discourse of two case studies: the longitudinal discourse of the NHS Covid-19 digital contact-tracing app and the snapshot discourse of the Ofqual A Level grade calculation algorithm, both related to the UK. We identified difficulties in interpretation and potential application in all three of the approaches. Other shortcomings, such as the detection of negation and sarcasm, were also found. We discuss the need for further transparency of these methods for diachronic social media researchers, including the potential for combining these approaches with qualitative ones (such as corpus linguistics and critical discourse analysis) in a more formal framework.
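The negation and sarcasm shortcomings mentioned above can be illustrated with a minimal sketch. It assumes a toy word-polarity lexicon and a naive negation rule; the names `LEXICON`, `NEGATORS`, `naive_score` and `negation_aware_score` are hypothetical and are not the tools or method used in the study.

```python
# Toy word-polarity lexicon (hypothetical; not taken from the paper or any real tool).
LEXICON = {"good": 1, "great": 1, "helpful": 1, "bad": -1, "useless": -1, "broken": -1}
NEGATORS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [word.strip(".,!?").lower() for word in text.split()]


def naive_score(text):
    """Sum word polarities and ignore negation entirely."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


def negation_aware_score(text):
    """Flip the polarity of the first lexicon word that follows a negator."""
    score = 0
    negate = False
    for token in tokenize(text):
        if token in NEGATORS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        if polarity != 0:
            score += -polarity if negate else polarity
            negate = False  # the negator has been consumed
    return score


negated = "The app is not helpful"
sarcastic = "Oh great, another ping"

print(naive_score(negated), negation_aware_score(negated))
print(naive_score(sarcastic), negation_aware_score(sarcastic))
```

In this sketch the simple negation rule corrects the score of the first tweet, but both scorers still rate the sarcastic tweet as positive, because sarcasm carries no lexical cue that a word list can detect.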
