ChartText: Linking Text with Charts in Documents

Recent work shows that interactive documents connecting text with visualizations facilitate reading comprehension. However, creating this type of content requires specialized knowledge. In this work, we present ChartText, a method that links text with visualizations. Our approach supports documents that include bar charts, line charts, and scatter plots. ChartText receives the visual encoding of the visualization and its associated text as input. It then performs the linking in two stages. The matching stage creates individual links relating simple phrases in the text to the chart. The grouping stage then combines the individual links according to their visual channels, building more meaningful connections. We use two datasets to design and evaluate our method: the first comes from web documents (24 bar charts with their texts) and the second from academic documents (25 bar charts, 25 line charts, and 25 scatter plots with their texts). Our experiments show that our method obtains F1 scores of 0.50 and 0.66 on the two datasets, respectively. With a semi-automatic approach in which individual links are corrected, the scores rise to 0.68 and 0.84, respectively. To show the usefulness of our technique, we implement two proofs of concept. In the first, we create interactive documents using graphic overlays, facilitating the reading experience. In the second, we use voice instead of text to annotate charts in real time. For example, in a videoconference, our technique can automatically annotate a chart following the presenter’s description.
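The two-stage idea described above can be illustrated with a deliberately simplified sketch. This is not the ChartText implementation: the abstract does not specify how phrases are matched, so plain substring matching stands in for the matching stage, and the names `Link`, `match`, `group`, and the dictionary layout of `encoding` are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One individual link between a phrase in the text and a chart value."""
    phrase: str   # phrase found in the text
    channel: str  # visual channel the value is encoded on, e.g. "x" or "color"
    value: str    # encoded value on that channel


def match(text, encoding):
    """Matching stage (assumed stand-in): link each encoded value that the
    text mentions literally, using case-insensitive substring search."""
    lowered = text.lower()
    links = []
    for channel, values in encoding.items():
        for value in values:
            if value.lower() in lowered:
                links.append(Link(phrase=value, channel=channel, value=value))
    return links


def group(links):
    """Grouping stage (assumed stand-in): combine individual links that
    share a visual channel into one entry per channel."""
    grouped = defaultdict(list)
    for link in links:
        grouped[link.channel].append(link.value)
    return dict(grouped)


# Hypothetical visual encoding: channel name -> values encoded on it.
encoding = {"x": ["2018", "2019"], "color": ["Brazil", "Chile"]}
text = "Brazil grew faster than Chile in 2019."

links = match(text, encoding)
groups = group(links)
print(groups)
```

Here the sentence yields three individual links, which the grouping step collects into one entry for the `x` channel and one for the `color` channel. A real system would need more robust phrase matching than literal substring search.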
