Leveraging Part-of-Speech Tagging for Sentiment Analysis in Short Texts and Regular Texts

Sentiment analysis has been approached with a spectrum of methodologies, including statistical learning on labelled corpora and rule-based approaches in which rules are constructed from observations on lexicons and from the output of natural language processing (NLP) tools. This paper details experiments that transform labelled datasets with NLP tools and then perform sentiment analysis with statistical learning algorithms. In addition to the common data pre-processing performed prior to sentiment analysis, we represent the tokens in the datasets using Part-Of-Speech (POS) tags. The aim of the experiments is to investigate the impact of POS tags on sentiment analysis, on both short texts and regular texts. The experimental results on short texts show that the combination of adjectives and adverbs predicts sentiment best. Although nouns are generally deemed neutral in sentiment polarity, the experimental results show that they help to increase the accuracy of sentiment analysis on regular texts. The role of negation analysis in the datasets is also investigated and reported on the basis of the experimental results.
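
As an illustration of the idea of selecting tokens by POS tag before classification, the following is a minimal sketch and not the implementation used in the experiments. It assumes Penn Treebank-style tags (JJ for adjectives, RB for adverbs, NN for nouns); the lexicon TOY_TAGS and the helpers tag and keep_by_pos are hypothetical names introduced here, standing in for a real POS tagger.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (an assumption for illustration only).
# A real pipeline would obtain these tags from a POS tagger.
TOY_TAGS: Dict[str, str] = {
    "the": "DT", "battery": "NN", "screen": "NN",
    "is": "VBZ", "really": "RB", "good": "JJ", "bad": "JJ",
}


def tag(tokens: List[str]) -> List[Tuple[str, str]]:
    """Pair each token with a tag; unknown words default to NN."""
    return [(tok, TOY_TAGS.get(tok.lower(), "NN")) for tok in tokens]


def keep_by_pos(tagged: List[Tuple[str, str]],
                prefixes: Tuple[str, ...]) -> List[str]:
    """Keep only the words whose tag starts with one of the prefixes."""
    return [word for word, pos in tagged if pos.startswith(prefixes)]


tagged = tag("The battery is really good".split())

# Adjectives and adverbs only.
adj_adv = keep_by_pos(tagged, ("JJ", "RB"))

# Adjectives, adverbs, and nouns.
adj_adv_noun = keep_by_pos(tagged, ("JJ", "RB", "NN"))

print(adj_adv)
print(adj_adv_noun)
```

The filtered token lists could then be turned into features for any statistical classifier; which tag combination to keep is the variable that experiments of this kind would compare.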
