VarifocalReader — In-Depth Visual Analysis of Large Text Documents

Interactive visualization provides valuable support for exploring, analyzing, and understanding textual documents. Certain tasks, however, require that insights derived from visual abstractions be verified by a human expert perusing the source text. So far, this problem has typically been solved by offering overview-detail techniques, which present different views at different levels of abstraction. This often leads to problems with visual continuity. Focus-context techniques, on the other hand, succeed in accentuating interesting subsections of large text documents but are normally not suited for integrating visual abstractions. With VarifocalReader we present a technique that helps to solve some of the problems of both approaches by combining characteristics of each. In particular, our method simplifies working with large and potentially complex text documents by simultaneously offering abstract representations of varying detail, based on the inherent structure of the document, and access to the text itself. In addition, VarifocalReader supports intra-document exploration through advanced navigation concepts and facilitates visual analysis tasks. The approach enables users to apply machine learning techniques and search mechanisms, and to assess and adapt these techniques, which helps to extract entities, concepts, and other artifacts from texts. In combination with the automatic generation of intermediate text levels through topic segmentation for thematic orientation, users can test hypotheses or develop interesting new research questions. To illustrate the advantages of our approach, we provide usage examples from literature studies.
