Topic-level sentiment analysis of social media data using deep learning

Abstract: Since the advent of Web 2.0, social media platforms have become a major source of user-generated content, because they make it easy to disseminate information, share views, and express opinions about current world events, services, products, and so on. Such social media data cover various themes discussed online and carry the sentiments of the users. To keep pace with the rate at which streaming data are generated on these platforms, it is crucial to detect the topics being discussed and to analyze users' sentiments towards those topics in an online manner, so that timely decisions can be made. Motivated by this, the paper proposes a deep learning based topic-level sentiment analysis model. The novelty of the proposed approach is that it works at the sentence level to extract the topic using online latent semantic indexing with a regularization constraint, and then applies a topic-level attention mechanism in a long short-term memory network to perform sentiment analysis. The model is unique in that it supports scalable and dynamic topic modeling over streaming short text data and performs sentiment analysis at the topic level. On the SemEval-2017 Task 4 Subtask B dataset, as a case of in-domain topic-level sentiment analysis, an average recall of 0.879 is achieved; for out-of-domain data, average recalls of 0.846, 0.824 and 0.794 are achieved on newly developed datasets collected from Twitter under the hashtags #ethereum, #bitcoin and #facebook, respectively. To assess scalability, we analyzed the model in terms of the average time in milliseconds to create feature vectors, the throughput in topics detected per second, and the average response time in seconds to handle sentiment analysis queries. The experimental results indicate that the model can support large-scale topic modeling over streaming data together with topic-level sentiment analysis.
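The abstract reports results as average recall but does not define it. The sketch below shows one plausible reading, assuming the measure follows the usual SemEval-2017 Task 4 Subtask B convention: recall is macro-averaged over the positive and negative classes within each topic, and the per-topic scores are then averaged. The function names, the label strings, and the (topic, gold, predicted) row layout are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def average_recall(gold, pred, labels=("positive", "negative")):
    """Macro-averaged recall over the given labels (assumed metric)."""
    recalls = []
    for label in labels:
        # Predictions for the items whose gold label is `label`.
        relevant = [p for g, p in zip(gold, pred) if g == label]
        if not relevant:
            # Skip classes that do not occur in the gold labels.
            continue
        recalls.append(sum(p == label for p in relevant) / len(relevant))
    return sum(recalls) / len(recalls) if recalls else 0.0


def topic_averaged_recall(rows, labels=("positive", "negative")):
    """Average the per-topic macro recall over all topics.

    `rows` is an iterable of (topic, gold_label, predicted_label) tuples;
    this layout is a hypothetical choice made for the example.
    """
    by_topic = defaultdict(lambda: ([], []))
    for topic, gold, pred in rows:
        by_topic[topic][0].append(gold)
        by_topic[topic][1].append(pred)
    scores = [average_recall(g, p, labels) for g, p in by_topic.values()]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    rows = [
        ("a", "positive", "positive"),
        ("a", "positive", "negative"),
        ("a", "negative", "negative"),
        ("a", "negative", "negative"),
        ("b", "positive", "positive"),
        ("b", "negative", "positive"),
    ]
    print(topic_averaged_recall(rows))
```

Macro-averaging gives each class equal weight regardless of its frequency, which is why this kind of measure is often preferred for imbalanced tweet sentiment data.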
