Sentiment Classification into Three Classes Applying Multinomial Bayes Algorithm, N-Grams, and Thesaurus

This paper develops a method for classifying English and Russian texts by sentiment into three classes: positive, negative, and neutral. The method is based on the Multinomial Naive Bayes classifier with n-gram features. The classifier is trained either on all three classes or on the two contrasting classes (positive and negative), with a threshold used to separate neutral texts. Experiments with texts on various topics showed a significant improvement in classification quality for reviews from a particular domain. The paper also analyzes the use of thesaurus relationships in three-class sentiment classification; this did not significantly improve the classification results.
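The two training setups mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the add-one smoothing, the helper names (`ngrams`, `MultinomialNB`, `predict_with_threshold`), and the rule that a text is neutral when the gap between the positive and negative log-scores falls below a threshold are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max (assumed whitespace tokenizer)."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class MultinomialNB:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def fit(self, texts, labels, n_max=2):
        self.n_max = n_max
        self.class_docs = Counter(labels)      # documents per class
        self.counts = defaultdict(Counter)     # n-gram counts per class
        for text, label in zip(texts, labels):
            self.counts[label].update(ngrams(text, n_max))
        self.vocab = {f for c in self.counts.values() for f in c}
        self.totals = {lab: sum(c.values()) for lab, c in self.counts.items()}
        return self

    def log_scores(self, text):
        """Unnormalized log-posterior of each class for the given text."""
        n_docs = sum(self.class_docs.values())
        v = len(self.vocab)
        feats = [f for f in ngrams(text, self.n_max) if f in self.vocab]
        scores = {}
        for label in self.class_docs:
            score = math.log(self.class_docs[label] / n_docs)
            for feat in feats:  # n-grams unseen in training are ignored
                score += math.log(
                    (self.counts[label][feat] + 1) / (self.totals[label] + v)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        """Setup 1: train on all classes and pick the highest-scoring one."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def predict_with_threshold(model, text, threshold=0.5):
    """Setup 2 (assumed rule): a positive/negative model whose small
    score margins are mapped to the neutral class."""
    scores = model.log_scores(text)
    margin = scores["positive"] - scores["negative"]
    if abs(margin) < threshold:
        return "neutral"
    return "positive" if margin > 0 else "negative"
```

For the three-class setup, `fit` is called with texts labeled `positive`, `negative`, and `neutral`, and `predict` is used directly. For the two-class setup, `fit` sees only `positive` and `negative` texts, and `predict_with_threshold` decides when a text is too weakly polarized to assign to either class; the value of `threshold` would have to be tuned on held-out data.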
