DIVE: A Mixed-Initiative System Supporting Integrated Data Exploration Workflows

Generating knowledge from data is an increasingly important activity. This process of data exploration consists of multiple tasks: data ingestion, visualization, statistical analysis, and storytelling. Though these tasks are complementary, analysts often execute them in separate tools. Moreover, these tools have steep learning curves due to their reliance on manual query specification. Here, we describe the design and implementation of DIVE, a web-based system that integrates state-of-the-art data exploration features into a single tool. DIVE contributes a mixed-initiative interaction scheme that combines recommendation with point-and-click manual specification, and a consistent visual language that unifies different stages of the data exploration workflow. In a controlled user study with 67 professional data scientists, we find that DIVE users were significantly more successful and faster than Excel users at completing predefined data visualization and analysis tasks.
