Sentiment Analysis in Microblogs Using HMMs with Syntactic and Sentimental Information

In this paper, we propose an approach to sentiment analysis in microblogs that learns patterns of syntactic and sentimental word transitions. Because a sentence is a sequence of words, modeling the sequential patterns of words in sentimental sentences should yield more accurate sentiment analysis. Most previous research, however, has focused on extending feature sets with n-grams, POS tags, polarity lexicons, and similar features, without considering sequential patterns. Our approach first identifies groups of words that play similar syntactic and sentimental roles, which we call SIGs (similar syntactic and sentimental information groups). We then build hidden Markov models (HMMs) whose hidden states are initialized from the SIGs. The SIGs serve as prior knowledge about the formative elements of sentimental sentences. With this initialization, the HMMs start from informative hidden states and can model the transition patterns of words in sentimental sentences more precisely, with more robust probability estimates. To evaluate performance, we compare the proposed approach with existing ones on the HCR dataset. The results show that the proposed approach outperforms the previous ones on various performance measures.
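The idea of seeding HMM hidden states with SIGs can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the toy lexicon, the choice to define a SIG as a (POS tag, polarity) pair, the add-one smoothing, and the helper names `sig_of`, `init_hmm`, `log_likelihood`, and `classify`. The sketch only builds the SIG-based initial parameters of one HMM per sentiment class and classifies a sentence by comparing forward-algorithm likelihoods; any subsequent re-estimation of the parameters (for example with Baum-Welch) is omitted.

```python
import math
from collections import defaultdict

# Hypothetical toy lexicon (not from the paper): word -> (POS tag, polarity).
# Here a SIG is assumed to be the set of words sharing the same pair.
LEXICON = {
    "i": ("PRON", "neu"), "we": ("PRON", "neu"),
    "love": ("VERB", "pos"), "support": ("VERB", "pos"),
    "hate": ("VERB", "neg"), "oppose": ("VERB", "neg"),
    "this": ("DET", "neu"), "the": ("DET", "neu"),
    "bill": ("NOUN", "neu"), "reform": ("NOUN", "neu"),
}
UNK_SIG = ("UNK", "neu")
STATES = sorted(set(LEXICON.values()) | {UNK_SIG})  # hidden states = SIGs
VOCAB = sorted(LEXICON) + ["<unk>"]


def sig_of(word):
    """Return the SIG (hidden-state label) assumed for a word."""
    return LEXICON.get(word, UNK_SIG)


def to_obs(word):
    """Map out-of-lexicon words to a single <unk> symbol."""
    return word if word in LEXICON else "<unk>"


def init_hmm(sentences, alpha=1.0):
    """Build initial HMM parameters by tagging each word with its SIG and
    counting, with add-alpha smoothing. Baum-Welch refinement is omitted."""
    start = defaultdict(float)
    trans = defaultdict(float)
    emit = defaultdict(float)
    for sent in sentences:
        if not sent:
            continue
        sigs = [sig_of(w) for w in sent]
        start[sigs[0]] += 1
        for a, b in zip(sigs, sigs[1:]):
            trans[(a, b)] += 1
        for s, w in zip(sigs, sent):
            emit[(s, to_obs(w))] += 1

    def normalize(counts, cols):
        probs = {}
        for r in STATES:
            total = sum(counts[(r, c)] for c in cols) + alpha * len(cols)
            for c in cols:
                probs[(r, c)] = (counts[(r, c)] + alpha) / total
        return probs

    start_total = sum(start.values()) + alpha * len(STATES)
    start_p = {s: (start[s] + alpha) / start_total for s in STATES}
    return start_p, normalize(trans, STATES), normalize(emit, VOCAB)


def log_likelihood(hmm, sent):
    """Forward algorithm. Unscaled, which is adequate for short sentences;
    longer sequences would need scaling or log-space arithmetic."""
    start_p, trans_p, emit_p = hmm
    obs = [to_obs(w) for w in sent]
    fwd = {s: start_p[s] * emit_p[(s, obs[0])] for s in STATES}
    for o in obs[1:]:
        fwd = {
            t: emit_p[(t, o)] * sum(fwd[s] * trans_p[(s, t)] for s in STATES)
            for t in STATES
        }
    return math.log(sum(fwd.values()))


# Tiny made-up training sets, one HMM per sentiment class.
POS_TRAIN = [["i", "love", "this", "bill"], ["we", "support", "the", "reform"]]
NEG_TRAIN = [["i", "hate", "this", "bill"], ["we", "oppose", "the", "reform"]]
MODELS = {"positive": init_hmm(POS_TRAIN), "negative": init_hmm(NEG_TRAIN)}


def classify(sent):
    """Pick the class whose HMM assigns the sentence the highest likelihood."""
    return max(MODELS, key=lambda label: log_likelihood(MODELS[label], sent))


print(classify(["we", "love", "the", "reform"]))
print(classify(["i", "oppose", "this", "bill"]))
```

Starting from SIG-tagged counts rather than random parameters gives each hidden state an interpretable role from the outset, which is the intuition the abstract describes; how the paper actually derives SIGs and trains the HMMs is not specified here.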
