A Hybrid Sentence Ordering Strategy in Multi-document Summarization

In extractive summarization, the extracted sentences must be properly arranged to produce a logical, coherent and readable summary. The problem is especially difficult in multi-document summarization, where the sentences come from different source documents. In this paper, several existing methods, each of which generates a reference relation, are combined through a linear combination of the resulting relations. We use four types of relationships between sentences (chronological, positional, topical and dependent) to build a graph model in which the vertices are sentences and the edges are weighted relationships of the four types. We then apply a variant of PageRank to obtain the ordering of sentences for multi-document summaries. We evaluated our hybrid model with two automatic measures: distance to a manual ordering and ROUGE score. The evaluation results show a significant improvement in ordering over strategies that omit some of the relations. The results also indicate that the hybrid model is robust across articles of different genres, as used in DUC 2004 and DUC 2005.
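The pipeline described above (a linear combination of four pairwise relations, followed by a PageRank-style ranking over the sentence graph) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the toy relation matrices, the equal weights, the damping factor of 0.85, and the choice to let score flow backward along "precedes" edges so that earlier sentences rank higher are all assumptions made for illustration.

```python
from typing import Dict, List

Matrix = List[List[float]]


def combine_relations(relations: Dict[str, Matrix],
                      weights: Dict[str, float]) -> Matrix:
    """Linearly combine pairwise relation matrices into one weighted graph.

    relations[name][i][j] is the (assumed) strength of evidence, under that
    relation, that sentence i should precede sentence j.
    """
    names = list(relations)
    n = len(relations[names[0]])
    combined = [[0.0] * n for _ in range(n)]
    for name in names:
        for i in range(n):
            for j in range(n):
                combined[i][j] += weights[name] * relations[name][i][j]
    return combined


def precedence_scores(pref: Matrix, damping: float = 0.85,
                      iterations: int = 100) -> List[float]:
    """PageRank-style power iteration on the reversed 'precedes' graph.

    Sentence i receives score from every sentence j that it precedes, so
    sentences that precede many others end up with high scores.
    """
    n = len(pref)
    # Total weight of edges pointing into j (evidence that j comes later).
    incoming = [sum(pref[k][j] for k in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Sentences with no predecessors spread their score uniformly.
        dangling = sum(scores[j] for j in range(n) if incoming[j] == 0)
        updated = []
        for i in range(n):
            flow = sum(pref[i][j] / incoming[j] * scores[j]
                       for j in range(n) if incoming[j] > 0)
            updated.append((1 - damping) / n + damping * (flow + dangling / n))
        scores = updated
    return scores


def order_sentences(pref: Matrix) -> List[int]:
    """Return sentence indices sorted from earliest to latest."""
    scores = precedence_scores(pref)
    return sorted(range(len(pref)), key=lambda i: -scores[i])


# Toy data for three sentences (hypothetical values, not from the paper).
relations = {
    "chronological": [[0, 1, 0], [0, 0, 0], [1, 1, 0]],
    "positional":    [[0, 1, 0], [0, 0, 0], [1, 0, 0]],
    "topical":       [[0, 0.5, 0], [0, 0, 0], [0.5, 0, 0]],
    "dependent":     [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
}
weights = {name: 0.25 for name in relations}  # assumed equal weights

combined = combine_relations(relations, weights)
scores = precedence_scores(combined)
order = order_sentences(combined)
print(order)
```

In this toy graph every relation agrees that sentence 2 precedes sentence 0, which precedes sentence 1, so the ranking recovers that order. With conflicting relations, the weights in the linear combination would determine which evidence dominates.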
