Interactive Visual Analytics for Sensemaking with Big Text

Abstract

Analysts face steep challenges when performing sensemaking tasks on collections of textual information that are too large to analyze without computational assistance. To scale up such sensemaking tasks, new methods are needed to interactively integrate human cognitive sensemaking activity with machine learning. Towards that goal, we offer a human-in-the-loop computational model that mirrors the human sensemaking process and consists of foraging and synthesis sub-processes. We model the synthesis loop as an interactive spatial projection and the foraging loop as an interactive relevance ranking combined with topic modeling. We combine these two components of the sensemaking process using semantic interaction, such that the human's spatial synthesis actions are transformed into automated foraging and synthesis of new relevant information. Ultimately, the model's ability to forage as a result of the analyst's synthesis activities makes interacting with big text data easier and more efficient, thereby facilitating analysts' sensemaking. We discuss the interaction design and theory behind our interactive sensemaking model. The model is embodied in a novel visual analytics prototype called Cosmos, in which analysts synthesize structure within the larger corpus by directly interacting with a reduced-dimensionality space to express relationships on a subset of the data. We then demonstrate how Cosmos supports sensemaking tasks with a realistic scenario that investigates the effect of natural disasters in Adelaide, Australia in September 2016 using a database of over 30,000 news articles.
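The coupling between synthesis and foraging described above can be illustrated with a toy sketch. It is not the Cosmos implementation: the function names (`update_weights`, `forage`), the additive weight update, the bag-of-words scoring, and the five-document corpus are all assumptions made for illustration, and the spatial projection and topic modeling steps are left out. The sketch only shows the general idea that a spatial action (dragging two documents together) can be turned into term weights that then drive a relevance ranking over documents not yet in the workspace.

```python
from collections import Counter

def term_vector(text):
    """Bag-of-words term counts for one document (lowercased, whitespace split)."""
    return Counter(text.lower().split())

def update_weights(weights, doc_a, doc_b, rate=0.5):
    """Upweight terms shared by two documents the analyst dragged together.

    Assumption: shared terms approximate what the analyst finds similar.
    The weights are renormalized so they sum to 1.
    """
    shared = set(doc_a) & set(doc_b)
    raised = {t: w + (rate if t in shared else 0.0) for t, w in weights.items()}
    total = sum(raised.values())
    return {t: w / total for t, w in raised.items()}

def forage(corpus, weights, exclude, top_k=2):
    """Rank documents outside the workspace by weighted term relevance."""
    scores = {
        doc_id: sum(weights.get(t, 0.0) * count for t, count in vec.items())
        for doc_id, vec in corpus.items()
        if doc_id not in exclude
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical five-document corpus, invented for this example.
texts = {
    "a": "storm floods adelaide power outage",
    "b": "adelaide storm damages power lines",
    "c": "football final draws record crowd",
    "d": "storm warning issued for coast",
    "e": "election results announced today",
}
corpus = {doc_id: term_vector(text) for doc_id, text in texts.items()}

# Start from uniform term weights over the whole vocabulary.
vocabulary = {t for vec in corpus.values() for t in vec}
weights = {t: 1.0 / len(vocabulary) for t in vocabulary}

# Synthesis action: the analyst places documents "a" and "b" close together.
weights = update_weights(weights, corpus["a"], corpus["b"])

# Foraging response: suggest new documents, skipping those already in view.
suggested = forage(corpus, weights, exclude={"a", "b"})
print(suggested)
```

In this toy setting, the storm-related document "d" moves to the top of the suggestions only because the analyst's spatial action raised the weight of the terms that "a" and "b" share; no query was typed.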
