Event-Based Extractive Summarization

Most approaches to extractive summarization define a set of features on which sentence selection is based, using algorithms independent of the features themselves. We propose a new set of features based on low-level, atomic events that describe relationships between important actors in a document or set of documents. We investigate the effect these new features have on extractive summarization, compared with a baseline feature set consisting of the words in the input documents, and with state-of-the-art summarization systems. Our experimental results indicate not only that the event-based features offer an improvement in summary quality over words as features, but also that this effect is more pronounced for more sophisticated summarization methods that avoid redundancy in the output.
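The separation between features and selection algorithm can be illustrated with a minimal sketch. The code below is not the method evaluated in this work; it assumes a simple greedy coverage heuristic as a stand-in for a redundancy-avoiding selector, and it implements only the baseline word features. The names `word_features` and `greedy_extract` are hypothetical. An event-based featurizer (for example, one returning actor-relation-actor tuples) could be passed in place of `word_features` without changing the selector.

```python
import re
from collections import Counter


def word_features(sentence):
    """Baseline feature set: the lowercased word types of a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def greedy_extract(sentences, featurize, max_sentences):
    """Greedy, redundancy-avoiding sentence selection (illustrative only).

    Each feature is weighted by the number of input sentences containing it.
    At every step the sentence adding the most not-yet-covered feature
    weight is chosen, so a sentence repeating covered content gains nothing.
    """
    feats = [featurize(s) for s in sentences]
    weights = Counter(f for fs in feats for f in fs)
    covered = set()
    chosen = []
    while len(chosen) < max_sentences:
        best, best_gain = None, 0
        for i, fs in enumerate(feats):
            if i in chosen:
                continue
            gain = sum(weights[f] for f in fs - covered)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # nothing left adds new information
            break
        chosen.append(best)
        covered |= feats[best]
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    docs = [
        "The rebels attacked the city.",
        "The rebels attacked the city.",
        "Aid workers reached the port.",
    ]
    for line in greedy_extract(docs, word_features, max_sentences=2):
        print(line)
```

Because the selector sees only sets of features, the duplicated sentence in the usage example contributes no new weight once its first copy is chosen, which is the kind of redundancy avoidance the abstract refers to.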
