An Interactive System for Visual Analytics of Dynamic Topic Models

The vast amount and rapid growth of data on the Web and in document repositories make knowledge extraction and trend analysis challenging. Dynamic topic modeling is a well-proven approach to the unsupervised analysis of large text corpora. While there is a solid body of research on the fundamentals and applications of this technique, interactive visual analysis systems that let end users perform analysis tasks with topic models are still rare. In this paper, we present D-VITA, an interactive text analysis system that uses dynamic topic modeling to detect the latent topic structure and its dynamics in a collection of documents. D-VITA helps end users understand and exploit the topic modeling results by providing interactive visualizations of topic evolution in document collections and by letting them browse documents through keyword search and through the similarity of their topic distributions. The system was evaluated by a scientific community that used D-VITA for trend analysis on its own data sources. The results indicate that D-VITA is highly usable and useful for productive analysis tasks.
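As an illustration of browsing documents by the similarity of their topic distributions, the following is a minimal sketch, not D-VITA's implementation. The abstract does not state which similarity measure the system uses; Jensen-Shannon divergence is assumed here as one common choice for comparing probability distributions, and the function names and sample data are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two topic distributions (0 = identical, 1 = disjoint)."""
    # The mixture m is positive wherever p or q is positive, so the KL terms are well defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_by_topic_similarity(query, docs):
    """Return document names ordered from most to least similar to the query distribution."""
    return sorted(docs, key=lambda name: js_divergence(query, docs[name]))


# Hypothetical per-document topic distributions over three topics (assumed data).
docs = {
    "doc_a": [0.7, 0.2, 0.1],
    "doc_b": [0.1, 0.1, 0.8],
    "doc_c": [0.6, 0.3, 0.1],
}
ranking = rank_by_topic_similarity(docs["doc_a"], docs)
```

Here `ranking` lists `doc_a` first, since a document has zero divergence from itself, followed by the documents whose topic mixtures are increasingly different.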
