Multi-Sentence Compression: Finding Shortest Paths in Word Graphs

We consider the task of summarizing a cluster of related sentences with a short sentence, which we call multi-sentence compression, and present a simple approach based on shortest paths in word graphs. The advantage and novelty of the proposed method is that it is syntax-lean: it requires little more than a tokenizer and a tagger. Despite its simplicity, it is capable of generating grammatical and informative summaries, as our experiments with English and Spanish data demonstrate.
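The core idea can be illustrated with a minimal sketch. It is not the method evaluated in the paper, and everything beyond "shortest path in a word graph" is an assumption made for illustration: whitespace tokenization, merging words by lowercased form only (no tagger), an edge cost of 1 / count, and the hypothetical names `build_word_graph` and `shortest_path`.

```python
import heapq
from collections import defaultdict

# Artificial start and end nodes shared by all sentences (illustrative names).
START, END = "<s>", "</s>"


def build_word_graph(sentences):
    """Count how often each word directly follows another across all sentences.

    Assumption: words are merged into one node when their lowercased forms
    match; a tagger-aware mapping would be stricter.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for left, right in zip(tokens, tokens[1:]):
            counts[left][right] += 1
    return counts


def shortest_path(counts):
    """Dijkstra search from START to END.

    Assumption: an edge seen n times costs 1 / n, so transitions shared by
    many sentences are cheap and the cheapest path favors common wording.
    """
    heap = [(0.0, [START])]
    settled = set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == END:
            return path[1:-1]  # drop the artificial start and end nodes
        if node in settled:
            continue
        settled.add(node)
        for nxt, n in counts[node].items():
            if nxt not in settled:
                heapq.heappush(heap, (cost + 1.0 / n, path + [nxt]))
    return []  # no path, e.g. when the cluster is empty


cluster = [
    "Obama visited Paris on Monday",
    "Obama visited Paris",
    "President Obama visited Paris last week",
]
compression = shortest_path(build_word_graph(cluster))
print(" ".join(compression))
```

With this toy cluster the cheapest path follows the transitions that all three sentences share. A plain inverse-count cost tends to favor very short paths, so a practical system would likely need additional constraints, such as a minimum length or a check that the path contains a verb.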
