Paper information - Cross-lingual Twitter Polarity Detection via Projection across Word-Aligned Corpora

In this paper, we propose an unsupervised framework that leverages the sentiment resources and tools available in English to automatically generate stand-alone polarity lexicons and classifiers for languages with scarce subjectivity resources, thus avoiding the need for labor-intensive manual annotation. Starting with a list of English sentiment-bearing words, we expand this lexicon using WordNet synsets. For each sentence pair in a given bilingual parallel corpus, the high-precision English polarity lexicon is applied to the English side, and the output sentiment label is then projected onto the target-language side via statistically derived word alignments. The resulting lexicon is applied to a large pool of unlabeled tweets in the target language in order to automatically label tweets as training data for a polarity classifier. Our experiments with Spanish and Portuguese as target languages show that the resulting classifiers improve polarity classification performance over lexicon-based classification for under-resourced languages in social media.
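The projection and auto-labeling steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes word-level polarity votes with a strict-majority rule, polarities encoded as +1 and -1, and precomputed alignments given as (English index, target index) pairs. The names `project_lexicon`, `label_tweet`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def project_lexicon(sentence_pairs, english_lexicon, min_count=1):
    """Project English word polarities onto aligned target-language words.

    sentence_pairs: iterable of (english_tokens, target_tokens, alignment),
    where alignment is a list of (english_index, target_index) pairs.
    english_lexicon: dict mapping lowercase English words to +1 or -1.
    Majority voting is an assumption of this sketch, not the paper's rule.
    """
    votes = defaultdict(Counter)
    for en_tokens, tgt_tokens, alignment in sentence_pairs:
        for en_i, tgt_j in alignment:
            polarity = english_lexicon.get(en_tokens[en_i].lower())
            if polarity is not None:
                votes[tgt_tokens[tgt_j].lower()][polarity] += 1

    lexicon = {}
    for word, counts in votes.items():
        top, top_n = counts.most_common(1)[0]
        others = sum(counts.values()) - top_n
        # Keep a word only if one polarity holds a strict majority.
        if top_n >= min_count and top_n > others:
            lexicon[word] = top
    return lexicon


def label_tweet(tokens, lexicon):
    """Label a tweet by summing word polarities; None means 'leave unlabeled'."""
    score = sum(lexicon.get(t.lower(), 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


# Toy English-Spanish data (hypothetical, for illustration only).
pairs = [
    (["I", "love", "this", "phone"],
     ["me", "encanta", "este", "teléfono"],
     [(0, 0), (1, 1), (2, 2), (3, 3)]),
    (["a", "terrible", "day"],
     ["un", "día", "terrible"],
     [(0, 0), (1, 2), (2, 1)]),
]
english_lexicon = {"love": 1, "terrible": -1}

es_lexicon = project_lexicon(pairs, english_lexicon)
labels = [label_tweet(t.split(), es_lexicon)
          for t in ["me encanta el café", "qué día terrible", "hola"]]
```

Tweets that receive a label here would then serve as automatically generated training data for a supervised polarity classifier, while tweets labeled `None` would be left out of the training pool.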

Riham Mansour | Mohamed Abdel-Hady | Ahmed Ashour
