TopicPie: An Interactive Visualization for LDA-Based Topic Analysis

LDA-based topic analysis is widely used in text mining. Because web document collections are large, analysts usually examine document clusters rather than individual documents. However, existing visualizations of LDA-based clustering do not intuitively present the contents of hot topics while preserving the relationships between the topics and the document clusters. In this paper, we propose an integrated interactive visualization method that provides intuitive and effective views of topic popularity, topic contents, document clusters, and the relationships between topics and document clusters, so that users can quickly identify topic-based patterns. We evaluate the method experimentally by comparing it with a tabular representation of the same information. The results show that our method significantly facilitates topic analysis, particularly in the study of Chinese culture.
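To make the quantities behind these views concrete, the following is a minimal sketch of how topic popularity, document clusters, and topic-cluster relationships could be derived from an LDA document-topic matrix. It is an illustration under stated assumptions, not the method of this paper: the function names, the use of the mean topic weight as "popularity", and the clustering of documents by their dominant topic are all hypothetical choices, and the matrix `theta` is made-up toy data.

```python
from collections import defaultdict


def topic_popularity(doc_topic):
    """Share of total topic weight per topic (assumed popularity measure).

    doc_topic is a list of rows, one per document, each row holding that
    document's topic proportions. The returned shares sum to 1, so they
    could be used directly as pie-slice sizes.
    """
    n_topics = len(doc_topic[0])
    totals = [sum(row[k] for row in doc_topic) for k in range(n_topics)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def cluster_by_dominant_topic(doc_topic):
    """Group document indices by their highest-weight topic (assumed rule)."""
    clusters = defaultdict(list)
    for doc_id, row in enumerate(doc_topic):
        dominant = max(range(len(row)), key=row.__getitem__)
        clusters[dominant].append(doc_id)
    return dict(clusters)


def cluster_topic_profile(doc_topic, clusters):
    """Mean topic proportions within each cluster.

    Each profile describes how strongly a cluster relates to every topic,
    which is the kind of topic-cluster relationship a view could encode.
    """
    n_topics = len(doc_topic[0])
    return {
        cluster_id: [
            sum(doc_topic[i][k] for i in doc_ids) / len(doc_ids)
            for k in range(n_topics)
        ]
        for cluster_id, doc_ids in clusters.items()
    }


# Toy document-topic matrix: 4 documents, 3 topics (illustrative values only).
theta = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
]

popularity = topic_popularity(theta)
clusters = cluster_by_dominant_topic(theta)
profiles = cluster_topic_profile(theta, clusters)
```

In practice the document-topic matrix would come from a fitted LDA model, and a different clustering step (for example, one based on a divergence between topic distributions) could replace the dominant-topic rule without changing the rest of the sketch.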
