Steerable Self-Driving Data Visualization

Abstract—In this work, we present a self-driving data visualization system, called DEEPEYE, that automatically generates and recommends visualizations based on the idea of visualization by examples. We propose effective visualization recognition techniques to decide which visualizations are meaningful, and visualization ranking techniques to rank the good visualizations. Furthermore, a main challenge of automatic visualization systems is that users may be misled when visualizations are suggested blindly, without knowledge of the user’s intent. To this end, we extend DEEPEYE to be easily steerable by allowing the user to use keyword search and by providing click-based faceted navigation. Empirical results, using real-life data and use cases, verify the power of our proposed system.
