Table2Charts: Recommending Charts by Learning Shared Table Representations

People commonly create different types of charts to explore a multi-dimensional dataset (table). However, building an intelligent assistant that recommends commonly composed charts requires solving three fundamental problems: "multi-dialect" unification, imbalanced data, and open vocabulary. In this paper, we propose the Table2Charts framework, which learns common patterns from a large corpus of (table, charts) pairs. Based on deep Q-learning with a copying mechanism and heuristic searching, Table2Charts performs table-to-sequence generation, where each sequence follows a chart template. On a large spreadsheet corpus with 196k tables and 306k charts, we show that Table2Charts can learn a shared representation of table fields so that tasks on different chart types mutually enhance each other. Table2Charts achieves >0.61 recall at top-3 and >0.49 recall at top-1 for both single-type and multi-type chart recommendation tasks.
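The abstract reports recall at top-1 and top-3 but does not define the metric here. The sketch below shows one common way such a number could be computed, assuming that recall at top-k for a table is the fraction of its user-created charts that appear among the first k recommendations, averaged over tables. The function names and the string encoding of charts are hypothetical and chosen only for illustration; the paper's exact definition may differ.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the relevant charts found in the first k recommendations.

    Assumed definition: |top_k ∩ relevant| / |relevant|.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        raise ValueError("each table needs at least one ground-truth chart")
    top_k = set(ranked[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(
    cases: List[Tuple[Sequence[Hashable], Set[Hashable]]], k: int
) -> float:
    """Average recall at top-k over (ranked recommendations, ground truth) pairs."""
    if not cases:
        raise ValueError("no evaluation cases given")
    return sum(recall_at_k(ranked, relevant, k) for ranked, relevant in cases) / len(cases)


# Hypothetical example: charts are encoded as plain strings for illustration.
ranked_charts = ["line(date, sales)", "bar(region, sales)", "pie(region, sales)"]
user_charts = {"bar(region, sales)"}

print(recall_at_k(ranked_charts, user_charts, k=1))
print(recall_at_k(ranked_charts, user_charts, k=3))
```

Under this assumed definition, a reported recall at top-3 above 0.61 would mean that, on average, more than 61% of the charts users actually created appear among the first three recommendations.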
