SocialVisTUM: An Interactive Visualization Toolkit for Correlated Neural Topic Models on Social Media Opinion Mining

Recent research in opinion mining has proposed word embedding-based topic modeling methods that yield more coherent topics than traditional topic modeling. In this paper, we demonstrate how these methods can be used to display correlated topic models on social media texts using SocialVisTUM, our proposed interactive visualization toolkit. It displays a graph with topics as nodes and their correlations as edges. Further details are displayed interactively to support the exploration of large text collections, e.g., representative words and sentences of topics, topic and sentiment distributions, hierarchical topic clustering, and customizable, predefined topic labels. The toolkit automatically optimizes its topic model on custom data to maximize coherence. We show a working instance of the toolkit on data crawled from English social media discussions about organic food consumption. The visualization confirms findings of a qualitative consumer research study. SocialVisTUM and its training procedures are accessible online.
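
To make the graph structure concrete, the following is a minimal sketch of how a topic graph with correlation-weighted edges could be derived from per-document topic distributions. It is an illustration under stated assumptions, not the toolkit's actual implementation: the abstract does not specify how correlations are computed, so Pearson correlation is assumed here, and the function names, topic labels, toy data, and the `threshold` value are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def topic_graph(doc_topic, labels, threshold=0.3):
    """Build a graph with topics as nodes and correlations as weighted edges.

    doc_topic: one row per document, one topic weight per column (hypothetical input).
    threshold: assumed cutoff; edges with |correlation| below it are dropped.
    """
    columns = list(zip(*doc_topic))  # one weight sequence per topic
    edges = []
    for i, j in combinations(range(len(labels)), 2):
        r = pearson(columns[i], columns[j])
        if abs(r) >= threshold:
            edges.append((labels[i], labels[j], round(r, 3)))
    return {"nodes": list(labels), "edges": edges}


# Toy document-topic weights, invented purely for illustration.
doc_topic = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
]
labels = ["health", "price", "environment"]  # hypothetical topic labels
graph = topic_graph(doc_topic, labels)
for source, target, weight in graph["edges"]:
    print(source, target, weight)
```

Thresholding on the absolute correlation keeps both positively and negatively correlated topic pairs, which a visualization could then distinguish by edge color or style.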
