Generating Indicative-Informative Summaries with SumUM

We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative-informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts, and it is a first step toward exploring dynamic summarization. The summary is produced through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated the indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies.
