Summarizing Information Graphics Textually

Information graphics (such as bar charts and line graphs) play a vital role in many multimodal documents. The majority of information graphics that appear in popular media are intended to convey a message, and the graphic designer uses deliberate communicative signals, such as highlighting certain aspects of the graphic, to bring that message out. The graphic, whose communicative goal (intended message) is often not captured by the document's accompanying text, contributes to the overall purpose of the document and cannot be ignored. This article presents our approach to providing the high-level content of a non-scientific information graphic via a brief textual summary that includes the intended message and the salient features of the graphic. This work brings together insights from empirical studies to determine what the summaries of this form of non-linguistic input data should contain, and how the information required for realizing the selected content can be extracted from the visual image and the textual components of the graphic. This work also presents a novel bottom-up generation approach that simultaneously constructs the discourse and sentence structures of textual summaries by leveraging discourse-related considerations such as the syntactic complexity of realized sentences and clause embeddings. The effectiveness of our work was validated by multiple evaluation studies.
