Integrating Crowdsourcing and Active Learning for Classification of Work-Life Events from Tweets

Social media, especially Twitter, is increasingly used for research with predictive analytics. In social media studies, natural language processing (NLP) techniques are used in conjunction with expert-based, manual, and qualitative analyses. However, social media data are unstructured and must undergo complex manipulation before they can be used for research. Manual annotation is the most resource- and time-consuming step, requiring multiple expert raters to reach consensus on every item, yet it is essential for creating the gold-standard datasets needed to train NLP-based machine learning classifiers. To reduce the burden of manual annotation while maintaining its reliability, we devised a crowdsourcing pipeline combined with active learning strategies. We demonstrated its effectiveness through a case study that identifies job loss events from individual tweets. We used the Amazon Mechanical Turk platform to recruit annotators from the Internet and designed a number of quality control measures to ensure annotation accuracy. We evaluated four active learning strategies (least confident, entropy, vote entropy, and Kullback-Leibler divergence), which aim to reduce the number of tweets that must be labeled to reach a desired classification performance. Results show that crowdsourcing is useful for creating high-quality annotations and that active learning reduces the number of tweets required, although there was no substantial difference among the strategies tested.
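As an illustration of the query strategies named above, the sketch below shows how least-confident, entropy, vote-entropy, and Kullback-Leibler scores could be computed for a pool of unlabeled tweets, with the highest-scoring batch forwarded to crowd annotators. This is a minimal sketch assuming a pool-based setup with probabilistic classifiers; the function names, array shapes, and batch-selection helper are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def least_confident(probs):
    # probs: (n_examples, n_classes) predicted class probabilities.
    # Score = 1 - max probability; higher means more uncertain.
    return 1.0 - probs.max(axis=1)

def predictive_entropy(probs, eps=1e-12):
    # Shannon entropy of each example's predicted class distribution.
    return -(probs * np.log(probs + eps)).sum(axis=1)

def vote_entropy(committee_labels, n_classes, eps=1e-12):
    # committee_labels: (n_members, n_examples) hard labels from a committee.
    # Entropy of the vote distribution measures committee disagreement.
    _, n_examples = committee_labels.shape
    scores = np.zeros(n_examples)
    for c in range(n_classes):
        v = (committee_labels == c).mean(axis=0)  # vote share for class c
        scores -= v * np.log(v + eps)
    return scores

def mean_kl_divergence(committee_probs, eps=1e-12):
    # committee_probs: (n_members, n_examples, n_classes) soft predictions.
    # Average KL divergence of each member from the committee consensus.
    consensus = committee_probs.mean(axis=0)      # (n_examples, n_classes)
    kl = (committee_probs *
          np.log((committee_probs + eps) / (consensus + eps))).sum(axis=2)
    return kl.mean(axis=0)

def select_batch(scores, batch_size):
    # Indices of the most informative unlabeled tweets to send for annotation.
    return np.argsort(scores)[::-1][:batch_size]
```

In a pool-based loop, the classifier (or committee) would be retrained after each crowd-annotated batch and the remaining unlabeled pool re-scored before the next selection.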
