Acr2Vec: Learning Acronym Representations in Twitter

Acronyms are common in Twitter and pose new challenges for social media analysis. Distributed representations have been applied successfully in natural language processing, but an acronym differs from an ordinary word in that it is generally defined by several words. To this end, we present Acr2Vec, an algorithmic framework for learning continuous representations of acronyms in Twitter. First, a Twitter ACRonym (TACR) dataset is automatically constructed, in which each acronym is expressed by one or more definitions. Then, three acronym embedding models are proposed: MPDE (Max Pooling Definition Embedding), APDE (Average Pooling Definition Embedding), and PLAE (Paragraph-Like Acronym Embedding). Both the qualitative results (similarity measures) and the quantitative results (acronym polarity classification) show that MPDE and APDE are superior to PLAE.
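The abstract names the pooling models without giving their details, so the following is only a minimal sketch of how max pooling and average pooling over a definition could work. It assumes that each word of a definition already has a pre-trained vector and that pooling is applied element-wise across those vectors. The toy vectors, the `pool_definition` helper, and the use of cosine similarity are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Hypothetical 3-dimensional word vectors, invented for illustration only.
WORD_VECTORS = {
    "laughing": np.array([0.9, 0.1, 0.3]),
    "out": np.array([0.2, 0.4, 0.1]),
    "loud": np.array([0.7, 0.8, 0.2]),
}


def pool_definition(definition, word_vectors, mode="average"):
    """Pool the vectors of a definition's words into one acronym vector.

    mode="max" takes the element-wise maximum (an MPDE-style assumption);
    mode="average" takes the element-wise mean (an APDE-style assumption).
    """
    vectors = [word_vectors[w] for w in definition.lower().split()
               if w in word_vectors]
    if not vectors:
        raise ValueError("no known words in definition: %r" % definition)
    stacked = np.stack(vectors)  # shape: (num_words, dim)
    if mode == "max":
        return stacked.max(axis=0)
    if mode == "average":
        return stacked.mean(axis=0)
    raise ValueError("mode must be 'max' or 'average'")


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


lol_max = pool_definition("laughing out loud", WORD_VECTORS, mode="max")
lol_avg = pool_definition("laughing out loud", WORD_VECTORS, mode="average")
similarity = cosine_similarity(lol_max, lol_avg)
```

Under these assumptions an acronym with several definitions would need an extra step to combine the per-definition vectors; the abstract does not say how that is done, so the sketch handles a single definition only.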
