The human is the loop: new directions for visual analytics

Visual analytics is the science of marrying interactive visualizations and analytic algorithms to support exploratory knowledge discovery in large datasets. We argue for a shift from a ‘human in the loop’ philosophy for visual analytics to a ‘human is the loop’ viewpoint, where the focus is on recognizing analysts’ work processes, and seamlessly fitting analytics into that existing interactive process. We survey a range of projects that provide visual analytic support contextually in the sensemaking loop, and outline a research agenda along with future challenges.
