Talk2Data: A Natural Language Interface for Exploratory Visual Analysis via Question Decomposition

A natural language interface (NLI) for exploratory visual analysis lets users directly "ask" analytical questions about given tabular data, which improves the user experience and lowers the technical barriers to data analysis. Existing techniques focus on generating a visualization from a concrete question. However, complex questions that require multiple data queries and visualizations to answer are frequently asked during data exploration and analysis, and existing techniques cannot easily handle them. To address this issue, we introduce Talk2Data, a natural language interface for exploratory visual analysis that supports answering complex questions. It leverages a deep-learning model to decompose a complex question into a series of simple questions that gradually elaborate on the user's requirements. To present the answers, we design a set of annotated and captioned visualizations that support interpretation and narration. We conducted an ablation study and a controlled user study to evaluate Talk2Data's effectiveness and usefulness.
