Inferring Strategies for Sentence Ordering in Multidocument News Summarization

The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single-document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization, where summary sentences may be drawn from different input articles. In this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments conducted on a corpus of multiple acceptable orderings that we developed for the task. Based on these experiments, we implemented a strategy for ordering information that combines constraints from the chronological order of events and topical relatedness. Evaluation of our augmented algorithm shows a significant improvement in ordering over two baseline strategies.
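As a rough illustration of how chronological and topical constraints might be combined, the following minimal sketch groups sentences into topical blocks, orders each block chronologically, and then orders the blocks by their earliest sentence. It is not the algorithm from the paper: the `Sentence` fields, the precomputed `topic` labels, and the `order_sentences` function are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sentence:
    text: str
    published: date  # assumption: timestamp of the event or source article
    topic: int       # assumption: precomputed topical-cluster label


def order_sentences(sentences):
    """Order sentences by topical block, then chronologically (illustrative only)."""
    # Collect sentences that share a topic label into the same block.
    blocks = {}
    for s in sentences:
        blocks.setdefault(s.topic, []).append(s)

    # Within each block, order sentences chronologically (stable for ties).
    ordered_blocks = [sorted(b, key=lambda s: s.published) for b in blocks.values()]

    # Order the blocks themselves by the date of their earliest sentence.
    ordered_blocks.sort(key=lambda b: b[0].published)

    # Flatten the blocks into a single summary ordering.
    return [s for block in ordered_blocks for s in block]


example = [
    Sentence("C", date(2001, 1, 3), topic=1),
    Sentence("A", date(2001, 1, 1), topic=0),
    Sentence("D", date(2001, 1, 4), topic=0),
    Sentence("B", date(2001, 1, 2), topic=1),
]
print([s.text for s in order_sentences(example)])  # ['A', 'D', 'B', 'C']
```

A purely chronological baseline would produce A, B, C, D for this input; the sketch instead keeps topically related sentences adjacent (A with D, B with C) while still respecting time order within and across blocks.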
