Chart-to-Text: Generating Natural Language Descriptions for Charts by Adapting the Transformer Model

Information visualizations such as bar charts and line charts are widely used for exploring data and communicating insights. Interpreting such visualizations can be challenging for some people, such as those who are visually impaired or have low visualization literacy. In this work, we introduce a new dataset and present a neural model for automatically generating natural language summaries for charts. The generated summaries provide an interpretation of the chart and convey its key insights. Our neural model extends the state-of-the-art model for the data-to-text generation task, which uses a transformer-based encoder-decoder architecture. We found that our approach outperforms the base model on a content selection metric by a wide margin (55.42% vs. 8.49%) and generates more informative, concise, and coherent summaries.
