Insight-centric Visualization Recommendation

Visualization recommendation systems simplify exploratory data analysis (EDA) and make data more accessible to users of all skill levels by automatically generating visualizations to explore. However, most existing systems rank all visualizations into a single list or a set of groups based on particular attributes or encodings. Such a global ranking makes it difficult and time-consuming for users to find the most interesting or relevant insights. To address these limitations, we introduce a novel class of visualization recommendation systems that automatically rank and recommend both groups of related insights and the most important insights within each group. Our approach combines results from many different learning-based methods to discover insights automatically. A key advantage is that it generalizes to a wide variety of attribute types, such as categorical, numerical, and temporal, as well as to complex, non-trivial combinations of these types. To evaluate the effectiveness of our approach, we implemented a new insight-centric visualization recommendation system, SpotLight, which generates and ranks annotated visualizations to explain each insight. We conducted a user study with 12 participants and two datasets; the results showed that users were able to quickly understand and find relevant insights in unfamiliar data.
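
The two-level ranking described above (groups of related insights, plus the most important insights within each group) can be illustrated with a minimal sketch. This is not SpotLight's implementation: the `Insight` record, its `group` and `score` fields, the `rank_insights` function, and the rule of ordering groups by their strongest member are all assumptions made purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    group: str        # hypothetical group label, e.g. "outlier" or "trend"
    description: str  # short text that would annotate the visualization
    score: float      # hypothetical importance score from some detector


def rank_insights(insights, top_k=3):
    """Return [(group, top insights)], ordered from best group to worst."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")

    # Collect insights by their group label.
    groups = defaultdict(list)
    for insight in insights:
        groups[insight.group].append(insight)

    ranked = []
    for name, members in groups.items():
        # Rank insights inside each group by descending score.
        members.sort(key=lambda i: i.score, reverse=True)
        ranked.append((name, members[:top_k]))

    # Rank groups by their strongest insight (an assumed, simple rule).
    ranked.sort(key=lambda pair: pair[1][0].score, reverse=True)
    return ranked
```

Calling `rank_insights(candidates, top_k=1)` would return one representative insight per group, with groups ordered by that insight's score. Other group-level rules, such as the mean score of the top few members, would fit the same structure.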
