Data and systems for medication-related text classification and concept normalization from Twitter: insights from the Social Media Mining for Health (SMM4H)-2017 shared task

Abstract

Objective: We executed the Social Media Mining for Health (SMM4H) 2017 shared tasks to enable the community-driven development and large-scale evaluation of automatic text processing methods for the classification and normalization of health-related text from social media. An additional objective was to publicly release manually annotated data.

Materials and Methods: We organized 3 independent subtasks: automatic classification of self-reports of 1) adverse drug reactions (ADRs) and 2) medication consumption, from medication-mentioning tweets, and 3) normalization of ADR expressions. Training data consisted of 15 717 annotated tweets for (1), 10 260 for (2), and 6650 ADR phrases and identifiers for (3), and exhibited typical properties of social-media-based health-related texts. Systems were evaluated using 9961, 7513, and 2500 instances for the 3 subtasks, respectively. Following the shared tasks, we evaluated the performance of classes of methods and of ensembles of system combinations.

Results: Among 55 system runs, the best system scores for the 3 subtasks were 0.435 (ADR class F1-score) for subtask-1, 0.693 (micro-averaged F1-score over 2 classes) for subtask-2, and 88.5% (accuracy) for subtask-3. Ensembles of system combinations obtained best scores of 0.476, 0.702, and 88.7%, outperforming individual systems.

Discussion: Among individual systems, support vector machines and convolutional neural networks showed high performance. Performance gains achieved by ensembles of system combinations suggest that such strategies may be suitable for operational systems relying on difficult text classification tasks (eg, subtask-1).

Conclusions: Data imbalance and lack of context remain challenges for natural language processing of social media text. Annotated data from the shared task have been made available as reference standards for future studies (http://dx.doi.org/10.17632/rxwfb3tysd.1).
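The evaluation measures named in the abstract (F1-score for a single class, micro-averaged F1-score over a subset of classes) and the idea of combining system outputs can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the integer label encodings, and the toy data are hypothetical, and majority voting is only one possible way to combine systems, since the abstract does not specify the ensemble strategy that was used.

```python
from collections import Counter


def f1_for_classes(gold, pred, target_classes):
    """Micro-averaged F1 over target_classes.

    With a single target class this reduces to that class's F1-score.
    Label values are hypothetical; any hashable labels work.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        for c in target_classes:
            if p == c and g == c:
                tp += 1          # correctly predicted class c
            elif p == c and g != c:
                fp += 1          # predicted c, gold is something else
            elif g == c and p != c:
                fn += 1          # gold is c, prediction missed it
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_vote(system_predictions):
    """Combine per-system label lists by majority vote per instance.

    Assumed combination rule, not necessarily the one used in the study.
    Ties go to the label that appears first among the systems.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*system_predictions)]


# Toy binary example (1 = ADR, 0 = no ADR): F1 for the ADR class only.
gold_binary = [1, 0, 0, 1, 0]
pred_binary = [1, 1, 0, 0, 0]
adr_f1 = f1_for_classes(gold_binary, pred_binary, target_classes=[1])

# Toy three-class example: micro-averaged F1 over two of the classes.
gold_multi = [1, 2, 3, 1, 2, 3]
pred_multi = [1, 2, 3, 2, 3, 3]
micro_f1 = f1_for_classes(gold_multi, pred_multi, target_classes=[1, 2])

# Toy ensemble of three systems over three instances.
ensemble = majority_vote([[1, 0, 1], [1, 1, 0], [0, 1, 1]])
```

Scoring only the minority class, as in the first toy example, keeps a system from looking strong simply by predicting the majority class, which matters when the data are imbalanced.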
