A novel approach to sentiment analysis in Persian using discourse and external semantic information

Sentiment analysis attempts to identify, extract, and quantify affective states and subjective information from various types of data such as text, audio, and video. In recent years, many approaches have been proposed to extract the sentiment of individuals from documents written in natural languages. The majority of these approaches have focused on English, while resource-lean languages such as Persian suffer from a lack of research work and language resources. To address this gap, the current work introduces new methods for sentiment analysis and applies them to Persian. The proposed approach is two-fold: the first method is based on classifier combination, and the second is based on deep neural networks that benefit from word embedding vectors. Both methods take advantage of local discourse information and external knowledge bases, cover several language issues such as negation and intensification, and address different granularity levels, namely the word, aspect, sentence, phrase, and document levels. To evaluate the performance of the proposed approach, a dataset, referred to as hotel reviews, was collected from Persian hotel reviews. The proposed approach is compared with counterpart methods on the benchmark dataset. The experimental results confirm the effectiveness of the proposed approach when compared to related work.
