WordBridge: Using Composite Tag Clouds in Node-Link Diagrams for Visualizing Content and Relations in Text Corpora

We introduce WordBridge, a novel graph-based visualization technique for showing relationships between entities in text corpora. The technique is a node-link visualization in which both nodes and links are tag clouds. Using these tag clouds, WordBridge conveys not only the entities and their connections but also the nature of each relationship, through representative keywords for nodes and edges. In this paper, we apply the technique in Apropos, an interactive web-based visual analytics environment where a user can explore a text corpus using WordBridge. We validate the technique using several case studies based on document collections such as intelligence reports, co-authorship networks, and works of fiction.
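To make the idea of tag clouds on both nodes and links concrete, the following is a minimal sketch of one possible underlying data model. It is not the method used in WordBridge or Apropos: the function name `build_composite_graph`, the pre-tokenized input, and the plain co-occurrence counting are all assumptions made for illustration. Each entity gets the most frequent keywords from documents that mention it, and each pair of co-mentioned entities gets the most frequent keywords from the documents they share.

```python
from collections import Counter
from itertools import combinations


def build_composite_graph(docs, entities, top_k=3):
    """Hypothetical helper: derive keyword lists for nodes and edges.

    docs     -- list of documents, each a list of lowercase tokens
    entities -- set of tokens treated as entities (assumed known in advance)
    top_k    -- number of keywords kept per node or edge tag cloud
    """
    node_terms = {}  # entity -> Counter of keywords
    edge_terms = {}  # (entity_a, entity_b) -> Counter of keywords

    for tokens in docs:
        # Entities mentioned in this document, sorted so pair keys are stable.
        present = sorted(set(tokens) & entities)
        # Every non-entity token counts as a candidate keyword.
        keywords = [t for t in tokens if t not in entities]

        for entity in present:
            node_terms.setdefault(entity, Counter()).update(keywords)
        for pair in combinations(present, 2):
            edge_terms.setdefault(pair, Counter()).update(keywords)

    def top(counter):
        return [word for word, _ in counter.most_common(top_k)]

    nodes = {entity: top(c) for entity, c in node_terms.items()}
    edges = {pair: top(c) for pair, c in edge_terms.items()}
    return nodes, edges


# Toy corpus (invented for this example).
docs = [
    ["alice", "bob", "meeting", "harbor", "meeting"],
    ["alice", "carol", "letter"],
    ["bob", "meeting"],
]
nodes, edges = build_composite_graph(docs, {"alice", "bob", "carol"})
```

A real system would likely replace raw counts with a term-weighting scheme and add stop-word filtering and entity extraction; those steps are omitted here to keep the sketch short.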
