Multi-Document Summarization By Sentence Extraction

This paper discusses a text extraction approach to multi-document summarization that builds on single-document summarization methods by using additional available information about the document set as a whole and the relationships between the documents. Multi-document summarization differs from single-document summarization in that the issues of compression, speed, redundancy, and passage selection are critical in the formation of useful summaries. Our approach addresses these issues by using domain-independent techniques based mainly on fast, statistical processing, a metric for reducing redundancy and maximizing diversity in the selected passages, and a modular framework to allow easy parameterization for different genres, corpus characteristics, and user requirements.
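The abstract does not spell out the redundancy-reducing metric, so the following is only an illustrative sketch of one well-known metric of this kind, maximal marginal relevance (MMR): each candidate sentence is scored by its relevance to a query minus its similarity to sentences already selected. The function names (`bag_of_words`, `cosine`, `mmr_select`), the bag-of-words cosine similarity, and the trade-off parameter `lam` are assumptions made for this example, not details taken from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector (hypothetical, minimal tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, sentences, k, lam=0.7):
    """Greedily pick up to k sentences, trading relevance against redundancy.

    lam = 1.0 ranks by relevance only; lower values penalize sentences
    that resemble ones already chosen. (lam is an assumed parameter name.)
    """
    q = bag_of_words(query)
    vecs = [bag_of_words(s) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], q)
            # Redundancy: highest similarity to any already-selected sentence.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    return [sentences[i] for i in selected]


sentences = [
    "The earthquake struck the city at dawn.",
    "The earthquake struck the city at dawn.",  # verbatim repeat from a second document
    "Rescue teams arrived within hours.",
]
diverse = mmr_select("earthquake city rescue", sentences, k=2, lam=0.5)
relevance_only = mmr_select("earthquake city rescue", sentences, k=2, lam=1.0)
```

With `lam=0.5` the repeated sentence is penalized and the rescue sentence is chosen second; with `lam=1.0` the redundancy term vanishes and the duplicate is selected. In a multi-document setting, where the same fact often appears in several sources, this kind of penalty is what keeps an extractive summary from repeating itself.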
