The Potential and Limitations of Automatic Sentence Extraction for Summarization

In this paper we present an empirical study of the potential and limitation of sentence extraction in text summarization. Our results show that the single document generic summarization task as defined in DUC 2001 needs to be carefully refocused as reflected in the low inter-human agreement at 100-word1 (0.40 score) and high upper bound at full text2 (0.88) summaries. For 100-word summaries, the performance upper bound, 0.65, achieved oracle extracts3. Such oracle extracts show the promise of sentence extraction algorithms; however, we first need to raise inter-human agreement to be able to achieve this performance level. We show that compression is a promising direction and that the compression ratio of summaries affects average human and system performance.

[1]  Daniel Marcu,et al.  The automatic construction of large-scale corpora for summarization research , 1999, SIGIR '99.

[2]  Francine Chen,et al.  A trainable document summarizer , 1995, SIGIR '95.

[3]  Robert L. Donaway,et al.  A Comparison of Rankings Produced by Summarization Evaluation Measures , 2000 .

[4]  Eduard Hovy,et al.  Manual and automatic evaluation of summaries , 2002, ACL 2002.

[5]  Hans Peter Luhn,et al.  The Automatic Creation of Literature Abstracts , 1958, IBM J. Res. Dev..

[6]  Daniel Marcu,et al.  Discourse Trees Are Good Indicators of Importance in Text , 1999 .

[7]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[8]  H. P. Edmundson,et al.  New Methods in Automatic Extracting , 1969, JACM.

[9]  Claire Cardie,et al.  Multidocument Summarization via Information Extraction , 2001, HLT.

[10]  David Evans,et al.  Tracking and summarizing news on a daily basis with Columbia's Newsblaster , 2002 .

[11]  Paul Over,et al.  Intrinsic Evaluation of Generic News Text Summarization Systems , 2003 .

[12]  Dragomir R. Radev,et al.  Generating Natural Language Summaries from Multiple On-Line Sources , 1998, CL.

[13]  Eduard Hovy,et al.  Automated Text Summarization in SUMMARIST , 1997, ACL 1997.

[14]  Eduard H. Hovy,et al.  Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics , 2003, NAACL.

[15]  George R. Doddington,et al.  Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics , 2002 .

[16]  Jade Goldstein-Stewart,et al.  Summarizing text documents: sentence selection and evaluation metrics , 1999, SIGIR '99.