Sentence Ordering in Multidocument Summarization

The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. In this paper, we describe two naive ordering techniques and show that they do not perform well. We present an integrated strategy for ordering information that combines constraints from the chronological order of events with constraints from cohesion. This strategy was derived from empirical observations made in experiments in which humans were asked to order information. An evaluation of the augmented algorithm shows that it orders information significantly better than the two naive techniques, which we used as baselines.
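
To make the idea of combining chronological and cohesion constraints concrete, the following is a minimal sketch and not the algorithm evaluated in the paper. It assumes that each sentence already carries a publication date and a topic label; the `Sentence` class, its fields, and both function names are hypothetical and introduced only for illustration. The first function orders purely by date, one plausible reading of a naive chronological baseline. The second keeps sentences on the same topic together and orders the resulting blocks by their earliest date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sentence:
    text: str
    published: date  # assumption: date of the source article
    topic: str       # assumption: cohesion label, e.g. from clustering


def chronological_order(sentences):
    """Naive ordering: sort by publication date only."""
    return sorted(sentences, key=lambda s: s.published)


def cohesion_aware_order(sentences):
    """Group sentences by topic, then order blocks and their members by date."""
    blocks = {}
    for s in sentences:
        blocks.setdefault(s.topic, []).append(s)
    # A block is placed according to its earliest sentence.
    ordered_blocks = sorted(
        blocks.values(), key=lambda block: min(s.published for s in block)
    )
    return [
        s
        for block in ordered_blocks
        for s in sorted(block, key=lambda s: s.published)
    ]


a = Sentence("The plane went down near the coast.", date(2001, 1, 1), "crash")
b = Sentence("Rescue teams reached the site.", date(2001, 1, 2), "rescue")
c = Sentence("Investigators blamed engine failure.", date(2001, 1, 3), "crash")

naive = chronological_order([c, a, b])
grouped = cohesion_aware_order([c, a, b])
```

In this toy input, the purely chronological order interleaves the two topics, while the grouped order keeps both sentences about the crash adjacent before moving on to the rescue.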
