A Comparison of Word-based and Context-based Representations for Classification Problems in Health Informatics

Distributed representations of text can serve as features for training a statistical classifier. These representations may be created either by composing word vectors or as context-based sentence vectors. We compare the two kinds of representations (word-based versus context-based) on three classification problems in health informatics: influenza infection classification, drug usage classification, and personal health mention classification. Across statistical classifiers trained for each of these problems, context-based representations derived from ELMo, the Universal Sentence Encoder, the Neural-Net Language Model, and FLAIR outperform Word2Vec, GloVe, and variants of the two adapted using the MeSH ontology. Using context-based instead of word-based representations improves accuracy by 2-4%.
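As a minimal sketch of the word-based side of this comparison, a sentence representation can be composed by averaging pretrained word vectors and then passed to a classifier as a fixed-length feature vector. The toy embedding table and dimensionality below are illustrative assumptions, not values from the paper (real systems would load, e.g., 300-dimensional Word2Vec or GloVe vectors):

```python
import numpy as np

# Toy pretrained word embeddings (illustrative only; a real pipeline
# would load Word2Vec or GloVe vectors, typically of dimension 300).
EMBEDDINGS = {
    "flu":   np.array([0.9, 0.1, 0.0]),
    "fever": np.array([0.8, 0.2, 0.1]),
    "music": np.array([0.0, 0.9, 0.8]),
}
DIM = 3


def sentence_vector(tokens):
    """Word-based representation: the mean of the word vectors.

    Out-of-vocabulary tokens are skipped; an empty (or fully
    out-of-vocabulary) sentence maps to the zero vector.
    """
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return np.zeros(DIM)
    return np.mean(vecs, axis=0)


# A tweet-like token sequence becomes one fixed-length feature vector.
features = sentence_vector(["flu", "fever", "unknown"])
print(features)
```

Context-based alternatives (ELMo, the Universal Sentence Encoder, FLAIR) instead encode the whole token sequence at once, so the resulting vector depends on word order and surrounding context rather than being a bag-of-vectors average as above.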
