Evaluating Visual Representations for Topic Understanding and Their Effects on Manually Generated Topic Labels

Probabilistic topic models are important tools for indexing, summarizing, and analyzing large document collections by their themes. However, promoting end-user understanding of topics remains an open research problem. We compare labels generated by users given four topic visualization techniques (word lists, word lists with bars, word clouds, and network graphs) against each other and against automatically generated labels. Our basis of comparison is participant ratings of how well labels describe documents from the topic. Our study has two phases: a labeling phase, in which participants label visualized topics, and a validation phase, in which different participants select which labels best describe the topics’ documents. Although all visualizations produce labels of similar quality, simple visualizations such as word lists allow participants to understand topics quickly, while complex visualizations take longer but expose multi-word expressions that simpler visualizations obscure. Automatic labels lag behind user-created labels, but our dataset of manually labeled topics highlights linguistic patterns (e.g., hypernyms, phrases) that can be used to improve automatic topic labeling algorithms.

1 Comprehensible Topic Models Needed

A central challenge of the “big data” era is to help users make sense of large text collections (Hotho et al., 2005). A common approach to summarizing the main themes in a corpus is to use topic models (Blei, 2012), data-driven statistical models that identify words that appear together in similar documents. These sets of words, or “topics”, evince internal coherence and can help guide users to relevant documents. For instance, an FBI investigator sifting through the released Hillary Clinton e-mails may see a topic with the words “Benghazi”, “Libya”, “Blumenthal”, and “success”, spurring the investigator to dig deeper to find further evidence of inappropriate communication with longtime friend Sidney Blumenthal regarding Benghazi.
A key challenge for topic modeling, however, is how to promote end-user understanding of individual topics and of the overall model. Most existing topic presentations use simple word lists (Chaney and Blei, 2012; Eisenstein et al., 2012). Although a variety of alternative topic visualization techniques exist (Sievert and Shirley, 2014; Yi et al., 2005), there has been no systematic assessment comparing them. Beyond exploring different visualization techniques, another way to make topics easier for users to understand is to provide descriptive labels that complement a topic’s set of words (Aletras et al., 2014). Unfortunately, manual labeling is slow and, while automatic labeling approaches exist (Lau et al., 2010; Mei et al., 2007; Lau et al., 2011), their effectiveness is not guaranteed for all tasks. To better understand these problems, we use labeling to evaluate topic model visualizations. Our study compares the impact of four commonly used topic visualization techniques on the labels that users create when interpreting a topic (Figure 1): word lists, word lists with bars, word clouds, and network graphs. On Amazon Mechanical Turk, one set of users viewed a series of individual topic visualizations.
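Of the four conditions, "word lists with bars" augments the ranked list with each word's relative weight in the topic. A minimal text rendering conveys the idea; the helper name, bar scaling, and example weights below are invented for illustration and do not reproduce the study's interface.

```python
def word_list_with_bars(topic, width=20):
    """Render (word, weight) pairs as a word list with proportional bars."""
    max_w = max(w for _, w in topic)
    lines = []
    for word, weight in topic:
        # Scale bars relative to the topic's top word; always show at least one mark.
        bar = "#" * max(1, round(width * weight / max_w))
        lines.append(f"{word:<12}{bar}")
    return "\n".join(lines)

# Hypothetical topic: (word, probability) pairs, highest first.
topic = [("benghazi", 0.09), ("libya", 0.07),
         ("blumenthal", 0.05), ("report", 0.02)]
print(word_list_with_bars(topic))
```

The bars add information that a plain word list drops: two adjacent words may differ sharply in weight, and that gap is invisible when only rank order is shown.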

[1] Jacob Eisenstein et al. Exploratory Thematic Analysis for Digitized Archival Collections, 2015, Digit. Scholarsh. Humanit.

[2] Timothy Baldwin et al. Representing topics labels for exploring digital libraries, 2014, IEEE/ACM Joint Conference on Digital Libraries.

[3] Stephen G. Kobourov et al. Experimental Comparison of Semantic Word Clouds, 2014, SEA.

[4] Mark Stevenson et al. Labelling Topics using Unsupervised Graph-based Methods, 2014, ACL.

[5] Jordan Boyd-Graber et al. Concurrent Visualization of Relationships between Words and Topics in Topic Models, 2014.

[6] Kenneth E. Shirley et al. LDAvis: A method for visualizing and interpreting topics, 2014.

[7] Timothy Baldwin et al. Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality, 2014, EACL.

[8] Philip Resnik et al. Argviz: Interactive Visualization of Topic Dynamics in Multi-party Conversations, 2013, NAACL.

[9] Chong Wang et al. Stochastic variational inference, 2012, J. Mach. Learn. Res.

[10] Jeffrey Heer et al. Termite: visualization techniques for assessing textual topic models, 2012, AVI.

[11] David M. Blei et al. Visualizing Topic Models, 2012, ICWSM.

[12] Aniket Kittur et al. TopicViz: interactive topic exploration in document collections, 2012, CHI Extended Abstracts.

[13] David B. Dunson et al. Probabilistic topic models, 2011, KDD '11 Tutorials.

[14] Ben Shneiderman et al. Group-in-a-Box Layout for Multi-faceted Analysis of Communities, 2011, IEEE Third Int'l Conference on Privacy, Security, Risk and Trust / Third Int'l Conference on Social Computing.

[15] Timothy Baldwin et al. Automatic Labelling of Topic Models, 2011, ACL.

[16] Alexander J. Smola et al. An architecture for parallel topic models, 2010, Proc. VLDB Endow.

[17] Timothy Baldwin et al. Best Topic Word Selection for Topic Labelling, 2010, COLING.

[18] Timothy Baldwin et al. Evaluating topic models for digital libraries, 2010, JCDL '10.

[19] Timothy Baldwin et al. Automatic Evaluation of Topic Coherence, 2010, NAACL.

[20] Susan T. Dumais et al. Characterizing Microblogs with Topic Models, 2010, ICWSM.

[21] Andrew McCallum et al. Rethinking LDA: Why Priors Matter, 2009, NIPS.

[22] Chong Wang et al. Reading Tea Leaves: How Humans Interpret Topic Models, 2009, NIPS.

[23] Andrew McCallum et al. Efficient methods for topic model inference on streaming document collections, 2009, KDD.

[24] ChengXiang Zhai et al. Automatic labeling of multinomial topic models, 2007, KDD '07.

[25] P. Riehmann et al. Interactive Sankey diagrams, 2005, IEEE Symposium on Information Visualization (INFOVIS 2005).

[26] John T. Stasko et al. Dust & Magnet: Multivariate Information Visualization Using a Magnet Metaphor, 2005, Inf. Vis.

[27] G. Paass et al. A Brief Survey of Text Mining, 2005, LDV Forum.

[28] Ramanathan V. Guha et al. TAP: a Semantic Web platform, 2003, Comput. Networks.

[29] Michael I. Jordan et al. Latent Dirichlet Allocation, 2001, J. Mach. Learn. Res.

[30] Eugene Charniak et al. A Maximum-Entropy-Inspired Parser, 2000, ANLP.

[31] George A. Miller et al. WordNet: A Lexical Database for English, 1995, HLT.

[32] Ben Shneiderman et al. Tree visualization with tree-maps: 2-d space-filling approach, 1992, ACM Trans. Graph.

[33] Edward M. Reingold et al. Graph drawing by force-directed placement, 1991, Softw. Pract. Exp.

[34] F. Jelinek et al. Perplexity—a measure of the difficulty of speech recognition tasks, 1977.

[35] Ben Shneiderman et al. Visual Analysis of Topical Evolution in Unstructured Text: Design and Evaluation of TopicFlow, 2015, Applications of Social Media and Social Network Analysis.

[36] Jordan L. Boyd-Graber et al. Interactive Topic Modeling, 2011, ACL.

[37] Matt Gardner. The Topic Browser: An Interactive Tool for Browsing Topic Models, 2010.

[38] Jon M. Kleinberg et al. An Impossibility Theorem for Clustering, 2002, NIPS.

[39] Ruslan Salakhutdinov et al. Evaluation methods for topic models, 2009, ICML '09.