Paper information - Sentiment analysis system adaptation for multilingual processing: The case of tweets


We study different strategies to classify sentiment from tweets, using supervised learning with hybrid features. We experiment with English and Spanish data and compare against benchmark competitions. We employ machine-translated data from other languages for training. We show that the use of multilingual data improves the sentiment classification accuracy.

Nowadays, opinion mining systems play a strategic role in different areas such as Marketing, Decision Support Systems or Policy Support. Since the arrival of the Web 2.0, more and more textual documents containing information that expresses opinions or comments in different languages have become available. Given the proven importance of such documents, effective multilingual opinion mining systems have become highly important to different fields. This paper presents the experiments carried out with the objective of developing a multilingual sentiment analysis system. We present initial evaluations of methods and resources performed in two international evaluation campaigns, for English and for Spanish. After our participation in both competitions, additional experiments were carried out with the aim of improving the performance of both the Spanish and English systems by using multilingual machine-translated data. Based on our evaluations, we show that the use of hybrid features and multilingual, machine-translated data (even from other languages) can help to better distinguish relevant features for sentiment classification and thus increase the precision of sentiment analysis systems.

José Manuel Perea Ortega | Alexandra Balahur
