FROM EXTRACTS TO ABSTRACTS: HUMAN SUMMARY PRODUCTION OPERATIONS FOR COMPUTER-AIDED SUMMARISATION

This paper presents a classification and evaluation of human summary production operations used to transform extracts into more concise, coherent and readable abstracts. Computer-aided summarisation (CAS) allows a user to post-edit an automatically produced extract to improve it. However, unlike other areas of summarisation, no guidance is available to users of CAS systems to help them complete their task. The research reported here addresses this gap by examining the linguistic operations used by a human summariser to transform extracts into abstracts. An evaluation shows that the operations are useful: they improve coherence when applied to extracts.
